1 BRC intro

The BRC is: Basic BRC plug.

2 Introduction to Data Wrangling with Tidy

2.1 A problem we face when dealing with data regularly.

Every dataset is different. Sometimes very different.

There are many ways to do things. Everyone has their favorite syntax.

The issue:
Many fundamental data-processing functions exist in base R and beyond, but they can be inconsistent or unnecessarily complex. The result is code that is confusing and doesn’t flow, for example deeply nested function calls.

2.2 What does it mean to be tidy?

The tidyverse is, most importantly, a philosophy for data analysis that more often than not makes wrangling data easier. The tidyverse community has built what it describes as an “opinionated” group of packages that readily talk to one another.

  • More efficient code
  • Easier to remember syntax
  • Easier to read syntax

Their manifesto: https://cran.r-project.org/web/packages/tidyverse/vignettes/manifesto.html

2.3 What does it actually mean to be tidy?

  • A defined vision for coding style in R
  • A defined vision for data formats in R
  • A defined vision for package design in R
  • A unified community pushing in a cohesive direction
  • A critical mass of people to influence the way the whole R community evolves

2.4 What are the main tools in the tidyverse?

  • readr – reading data into R
  • dplyr – manipulating data
  • tibble - working with tibbles
  • tidyr – miscellaneous tools for tidying data
  • ggplot2 – making pretty graphs
  • stringr – working with strings
  • purrr - iterating over data
  • forcats - working with factors

Other tools have since been built for the tidy community, which also overlaps with Bioconductor. But the packages above are the linchpins that hold it together.

2.5 What are we doing today?

Workflow Image for working with data.

3 Let’s get tidy!

3.1 First step: let’s load the data we are using today

3.2 Are all data frames equal?

##   salmon_id    common_name  age_classbylength length_mm IGF1_ng_ml
## 1     35032 Chinook salmon           yearling       147   41.26006
## 2     35035 Sockeye salmon           juvenile       121         NA
## 3     35036 Sockeye salmon           juvenile       112         NA
## 4     35037      Steelhead           juvenile       220   42.70981
## 5     35038      Steelhead           juvenile       152         NA
## 6     35033 Chinook salmon mixed age juvenile       444   62.11528
##   salmon_id    common_name  age_classbylength   variable     value
## 1     35032 Chinook salmon           yearling  length_mm 147.00000
## 2     35032 Chinook salmon           yearling IGF1_ng_ml  41.26006
## 3     35033 Chinook salmon mixed age juvenile  length_mm 444.00000
## 4     35033 Chinook salmon mixed age juvenile IGF1_ng_ml  62.11528
## 5     35034 Sockeye salmon           juvenile  length_mm 139.00000
## 6     35034 Sockeye salmon           juvenile IGF1_ng_ml        NA
##   salmon_id    common_name  age_classbylength  variable value
## 1     35032 Chinook salmon           yearling length_mm   147
## 2     35033 Chinook salmon mixed age juvenile length_mm   444
## 3     35034 Sockeye salmon           juvenile length_mm   139
## 4     35035 Sockeye salmon           juvenile length_mm   121
## 5     35036 Sockeye salmon           juvenile length_mm   112
## 6     35037      Steelhead           juvenile length_mm   220
##   salmon_id    common_name  age_classbylength   variable    value
## 1     35032 Chinook salmon           yearling IGF1_ng_ml 41.26006
## 2     35033 Chinook salmon mixed age juvenile IGF1_ng_ml 62.11528
## 3     35034 Sockeye salmon           juvenile IGF1_ng_ml       NA
## 4     35035 Sockeye salmon           juvenile IGF1_ng_ml       NA
## 5     35036 Sockeye salmon           juvenile IGF1_ng_ml       NA
## 6     35037      Steelhead           juvenile IGF1_ng_ml 42.70981

3.3 What is a tidy dataset?

A tidy dataset is a data frame (or table) for which the following are true:

  • Each variable has its own column
  • Each observation has its own row
  • Each value has its own cell

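Of the four dataframes printed above, only the first is tidy: each measured variable (length_mm, IGF1_ng_ml) has its own column and each fish has its own row. In the others, the measurements are stacked into a variable/value pair of columns or split across separate tables.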

3.4 Why bother?

Consistent dataframe layouts help to ensure that all values are present and that relationships between data points are clear.

R is a vectorized programming language: it builds data frames from vectors, and it works best when its operations are vectorized. Tidy data takes advantage of both of these aspects of R.

=> Precise and Fast
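
As a minimal sketch of what "vectorized" means here, the snippet below converts a column of lengths in one expression and compares it with an explicit loop. The values are the six `length_mm` entries printed earlier; the variable names are assumptions made for illustration.

```r
# Lengths (mm) copied from the first six rows of the salmon table above
length_mm <- c(147, 121, 112, 220, 152, 444)

# Vectorized: one expression converts the whole column to centimetres
length_cm <- length_mm / 10

# Equivalent explicit loop, shown only for contrast
length_cm_loop <- numeric(length(length_mm))
for (i in seq_along(length_mm)) {
  length_cm_loop[i] <- length_mm[i] / 10
}

# Both approaches give the same values
all.equal(length_cm, length_cm_loop)
```

Because each variable in a tidy dataframe is a single column (a vector), operations like this apply to it directly.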

3.5 Let’s load the tidyverse

## ── Attaching packages ───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────── tidyverse 1.2.1 ──
## ✔ ggplot2 3.2.1     ✔ purrr   0.3.3
## ✔ tibble  2.1.3     ✔ dplyr   0.8.3
## ✔ tidyr   1.0.0     ✔ stringr 1.4.0
## ✔ readr   1.3.1     ✔ forcats 0.4.0
## ── Conflicts ──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag()    masks stats::lag()
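
The startup message above is what attaching the tidyverse prints. A short sketch of the call, and of how to reach the masked stats functions listed under "Conflicts", might look like this (the `moving_avg` name is hypothetical):

```r
# Attach the core tidyverse packages; this prints the startup summary
library(tidyverse)

# After attaching, a bare filter() refers to dplyr's version
environmentName(environment(filter))

# The masked function stays reachable through an explicit namespace prefix
moving_avg <- stats::filter(1:5, rep(1 / 3, 3))
moving_avg
```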

4 These tools share the same logic

-> insert graphic here https://github.com/trinker/tidyr_in_a_nutshell

5 dplyr: A tool to access and manipulate data in a dataframe

5.1 Select

Select lets you pick a specific variable or variables (columns) from a dataframe. The result is always a dataframe, even when only one variable is selected.

## # A tibble: 97 x 1
##    common_name   
##    <chr>         
##  1 Chinook salmon
##  2 Sockeye salmon
##  3 Sockeye salmon
##  4 Steelhead     
##  5 Steelhead     
##  6 Chinook salmon
##  7 Sockeye salmon
##  8 Steelhead     
##  9 Steelhead     
## 10 Steelhead     
## # … with 87 more rows
## # A tibble: 97 x 2
##    age_classbylength  common_name   
##    <chr>              <chr>         
##  1 yearling           Chinook salmon
##  2 juvenile           Sockeye salmon
##  3 juvenile           Sockeye salmon
##  4 juvenile           Steelhead     
##  5 juvenile           Steelhead     
##  6 mixed age juvenile Chinook salmon
##  7 juvenile           Sockeye salmon
##  8 juvenile           Steelhead     
##  9 juvenile           Steelhead     
## 10 juvenile           Steelhead     
## # … with 87 more rows
## # A tibble: 97 x 4
##    salmon_id common_name    age_classbylength  IGF1_ng_ml
##        <dbl> <chr>          <chr>                   <dbl>
##  1     35032 Chinook salmon yearling                 41.3
##  2     35035 Sockeye salmon juvenile                 NA  
##  3     35036 Sockeye salmon juvenile                 NA  
##  4     35037 Steelhead      juvenile                 42.7
##  5     35038 Steelhead      juvenile                 NA  
##  6     35033 Chinook salmon mixed age juvenile       62.1
##  7     35034 Sockeye salmon juvenile                 NA  
##  8     35048 Steelhead      juvenile                 24.2
##  9     35049 Steelhead      juvenile                 NA  
## 10     35050 Steelhead      juvenile                 63.5
## # … with 87 more rows
## # A tibble: 97 x 3
##    common_name    age_classbylength  length_mm
##    <chr>          <chr>                  <dbl>
##  1 Chinook salmon yearling                 147
##  2 Sockeye salmon juvenile                 121
##  3 Sockeye salmon juvenile                 112
##  4 Steelhead      juvenile                 220
##  5 Steelhead      juvenile                 152
##  6 Chinook salmon mixed age juvenile       444
##  7 Sockeye salmon juvenile                 139
##  8 Steelhead      juvenile                 288
##  9 Steelhead      juvenile                 190
## 10 Steelhead      juvenile                 283
## # … with 87 more rows
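
The calls that produced the output above are not shown, so the following is only a sketch of typical `select()` usage. It rebuilds the first six rows of the salmon table as a small tibble; the object name `salmon` and the result names are assumptions.

```r
library(dplyr)

# First six rows of the salmon data shown earlier (object name is an assumption)
salmon <- tibble::tribble(
  ~salmon_id, ~common_name,     ~age_classbylength,   ~length_mm, ~IGF1_ng_ml,
  35032,      "Chinook salmon", "yearling",           147,        41.26006,
  35035,      "Sockeye salmon", "juvenile",           121,        NA,
  35036,      "Sockeye salmon", "juvenile",           112,        NA,
  35037,      "Steelhead",      "juvenile",           220,        42.70981,
  35038,      "Steelhead",      "juvenile",           152,        NA,
  35033,      "Chinook salmon", "mixed age juvenile", 444,        62.11528
)

one_col   <- select(salmon, common_name)                    # one column, still a tibble
reordered <- select(salmon, age_classbylength, common_name) # columns in the order listed
no_length <- select(salmon, -length_mm)                     # everything except length_mm
col_range <- select(salmon, common_name:length_mm)          # a contiguous range of columns
```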

5.2 Filter

Filter allows you to access observations based on specific criteria

## # A tibble: 11 x 5
##    salmon_id common_name    age_classbylength length_mm IGF1_ng_ml
##        <dbl> <chr>          <chr>                 <dbl>      <dbl>
##  1     35035 Sockeye salmon juvenile                121         NA
##  2     35036 Sockeye salmon juvenile                112         NA
##  3     35034 Sockeye salmon juvenile                139         NA
##  4     35144 Sockeye salmon juvenile                140         NA
##  5     35147 Sockeye salmon juvenile                115         NA
##  6     35096 Sockeye salmon juvenile                115         NA
##  7     35097 Sockeye salmon juvenile                110         NA
##  8     35098 Sockeye salmon juvenile                112         NA
##  9     35099 Sockeye salmon juvenile                111         NA
## 10     35100 Sockeye salmon juvenile                118         NA
## 11     35119 Sockeye salmon juvenile                122         NA
## # A tibble: 57 x 5
##    salmon_id common_name    age_classbylength  length_mm IGF1_ng_ml
##        <dbl> <chr>          <chr>                  <dbl>      <dbl>
##  1     35032 Chinook salmon yearling                 147       41.3
##  2     35035 Sockeye salmon juvenile                 121       NA  
##  3     35036 Sockeye salmon juvenile                 112       NA  
##  4     35033 Chinook salmon mixed age juvenile       444       62.1
##  5     35034 Sockeye salmon juvenile                 139       NA  
##  6     35142 Chinook salmon yearling                 149       66.5
##  7     35143 Chinook salmon yearling                 204       80.9
##  8     35144 Sockeye salmon juvenile                 140       NA  
##  9     35145 Chinook salmon yearling                 130       23.4
## 10     35146 Chinook salmon mixed age juvenile       422      101. 
## # … with 47 more rows
## # A tibble: 59 x 5
##    salmon_id common_name    age_classbylength  length_mm IGF1_ng_ml
##        <dbl> <chr>          <chr>                  <dbl>      <dbl>
##  1     35032 Chinook salmon yearling                 147       41.3
##  2     35035 Sockeye salmon juvenile                 121       NA  
##  3     35036 Sockeye salmon juvenile                 112       NA  
##  4     35033 Chinook salmon mixed age juvenile       444       62.1
##  5     35034 Sockeye salmon juvenile                 139       NA  
##  6     35142 Chinook salmon yearling                 149       66.5
##  7     35143 Chinook salmon yearling                 204       80.9
##  8     35144 Sockeye salmon juvenile                 140       NA  
##  9     35145 Chinook salmon yearling                 130       23.4
## 10     35146 Chinook salmon mixed age juvenile       422      101. 
## # … with 49 more rows
## # A tibble: 36 x 5
##    salmon_id common_name    age_classbylength  length_mm IGF1_ng_ml
##        <dbl> <chr>          <chr>                  <dbl>      <dbl>
##  1     35036 Sockeye salmon juvenile                 112       NA  
##  2     35037 Steelhead      juvenile                 220       42.7
##  3     35033 Chinook salmon mixed age juvenile       444       62.1
##  4     35048 Steelhead      juvenile                 288       24.2
##  5     35050 Steelhead      juvenile                 283       63.5
##  6     35051 Steelhead      juvenile                 279       61.2
##  7     35052 Steelhead      juvenile                 235       30.6
##  8     35053 Steelhead      juvenile                 230       49.4
##  9     35056 Steelhead      juvenile                 208       57.4
## 10     35057 Steelhead      juvenile                 240       20.2
## # … with 26 more rows
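
A hedged sketch of the kinds of `filter()` calls that give results like those above, again on a six-row stand-in for the salmon data. The object names and the 200 mm cutoff are assumptions, not the original criteria.

```r
library(dplyr)

# First six rows of the salmon data shown earlier (object name is an assumption)
salmon <- tibble::tribble(
  ~salmon_id, ~common_name,     ~age_classbylength,   ~length_mm, ~IGF1_ng_ml,
  35032,      "Chinook salmon", "yearling",           147,        41.26006,
  35035,      "Sockeye salmon", "juvenile",           121,        NA,
  35036,      "Sockeye salmon", "juvenile",           112,        NA,
  35037,      "Steelhead",      "juvenile",           220,        42.70981,
  35038,      "Steelhead",      "juvenile",           152,        NA,
  35033,      "Chinook salmon", "mixed age juvenile", 444,        62.11528
)

# Rows matching a single value
sockeye <- filter(salmon, common_name == "Sockeye salmon")

# Rows matching any of several values
chinook_or_sockeye <- filter(salmon, common_name %in% c("Chinook salmon", "Sockeye salmon"))

# Rows NOT matching a value
not_steelhead <- filter(salmon, common_name != "Steelhead")

# Several conditions separated by commas are combined with AND
# (the 200 mm cutoff is a hypothetical example)
long_with_igf <- filter(salmon, length_mm > 200, !is.na(IGF1_ng_ml))
```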

5.3 Arrange

Arrange sorts the dataframe based on a specific variable or variables

## # A tibble: 97 x 5
##    salmon_id common_name    age_classbylength length_mm IGF1_ng_ml
##        <dbl> <chr>          <chr>                 <dbl>      <dbl>
##  1     35095 Chinook salmon subyearling              90         NA
##  2     35097 Sockeye salmon juvenile                110         NA
##  3     35099 Sockeye salmon juvenile                111         NA
##  4     35036 Sockeye salmon juvenile                112         NA
##  5     35098 Sockeye salmon juvenile                112         NA
##  6     35147 Sockeye salmon juvenile                115         NA
##  7     35096 Sockeye salmon juvenile                115         NA
##  8     35100 Sockeye salmon juvenile                118         NA
##  9     35035 Sockeye salmon juvenile                121         NA
## 10     35119 Sockeye salmon juvenile                122         NA
## # … with 87 more rows
## # A tibble: 97 x 5
##    salmon_id common_name    age_classbylength  length_mm IGF1_ng_ml
##        <dbl> <chr>          <chr>                  <dbl>      <dbl>
##  1     35033 Chinook salmon mixed age juvenile       444      62.1 
##  2     35146 Chinook salmon mixed age juvenile       422     101.  
##  3     35110 Chinook salmon mixed age juvenile       275      81.5 
##  4     35129 Chinook salmon yearling                 225      72.7 
##  5     35103 Chinook salmon yearling                 216      81.2 
##  6     35115 Chinook salmon yearling                 215      53.5 
##  7     35112 Chinook salmon yearling                 205      90.5 
##  8     35143 Chinook salmon yearling                 204      80.9 
##  9     35079 Chinook salmon yearling                 199      53.2 
## 10     35081 Chinook salmon yearling                 196       5.56
## # … with 87 more rows
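
The two outputs above look like an ascending sort by length and a sort by species with length descending. A sketch of such calls on a six-row stand-in (object names are assumptions):

```r
library(dplyr)

# First six rows of the salmon data shown earlier (object name is an assumption)
salmon <- tibble::tribble(
  ~salmon_id, ~common_name,     ~age_classbylength,   ~length_mm, ~IGF1_ng_ml,
  35032,      "Chinook salmon", "yearling",           147,        41.26006,
  35035,      "Sockeye salmon", "juvenile",           121,        NA,
  35036,      "Sockeye salmon", "juvenile",           112,        NA,
  35037,      "Steelhead",      "juvenile",           220,        42.70981,
  35038,      "Steelhead",      "juvenile",           152,        NA,
  35033,      "Chinook salmon", "mixed age juvenile", 444,        62.11528
)

# Ascending sort on one variable
by_length <- arrange(salmon, length_mm)

# Sort by species first, then by length from largest to smallest within species
by_species_desc <- arrange(salmon, common_name, desc(length_mm))
```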

5.4 Mutate

Mutate creates a new variable based on some form of computation

## # A tibble: 97 x 6
##    salmon_id common_name  age_classbyleng… length_mm IGF1_ng_ml `scale(IGF1_ng_…
##        <dbl> <chr>        <chr>                <dbl>      <dbl>            <dbl>
##  1     35032 Chinook sal… yearling               147       41.3           -0.258
##  2     35035 Sockeye sal… juvenile               121       NA             NA    
##  3     35036 Sockeye sal… juvenile               112       NA             NA    
##  4     35037 Steelhead    juvenile               220       42.7           -0.191
##  5     35038 Steelhead    juvenile               152       NA             NA    
##  6     35033 Chinook sal… mixed age juven…       444       62.1            0.704
##  7     35034 Sockeye sal… juvenile               139       NA             NA    
##  8     35048 Steelhead    juvenile               288       24.2           -1.04 
##  9     35049 Steelhead    juvenile               190       NA             NA    
## 10     35050 Steelhead    juvenile               283       63.5            0.766
## # … with 87 more rows
## # A tibble: 97 x 6
##    salmon_id common_name  age_classbyleng… length_mm IGF1_ng_ml IGFngml_zscore[…
##        <dbl> <chr>        <chr>                <dbl>      <dbl>            <dbl>
##  1     35032 Chinook sal… yearling               147       41.3           -0.258
##  2     35035 Sockeye sal… juvenile               121       NA             NA    
##  3     35036 Sockeye sal… juvenile               112       NA             NA    
##  4     35037 Steelhead    juvenile               220       42.7           -0.191
##  5     35038 Steelhead    juvenile               152       NA             NA    
##  6     35033 Chinook sal… mixed age juven…       444       62.1            0.704
##  7     35034 Sockeye sal… juvenile               139       NA             NA    
##  8     35048 Steelhead    juvenile               288       24.2           -1.04 
##  9     35049 Steelhead    juvenile               190       NA             NA    
## 10     35050 Steelhead    juvenile               283       63.5            0.766
## # … with 87 more rows
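
As a sketch of how a z-score column like the one above can be added: the column name `IGFngml_zscore` comes from the output, while the object names are assumptions. The `[…` in the printed header suggests the original stored the matrix that `scale()` returns; wrapping it in `as.numeric()` gives a plain numeric column instead.

```r
library(dplyr)

# First six rows of the salmon data shown earlier (object name is an assumption)
salmon <- tibble::tribble(
  ~salmon_id, ~common_name,     ~age_classbylength,   ~length_mm, ~IGF1_ng_ml,
  35032,      "Chinook salmon", "yearling",           147,        41.26006,
  35035,      "Sockeye salmon", "juvenile",           121,        NA,
  35036,      "Sockeye salmon", "juvenile",           112,        NA,
  35037,      "Steelhead",      "juvenile",           220,        42.70981,
  35038,      "Steelhead",      "juvenile",           152,        NA,
  35033,      "Chinook salmon", "mixed age juvenile", 444,        62.11528
)

# Name the new column explicitly; as.numeric() drops scale()'s matrix attributes
with_scale <- mutate(salmon, IGFngml_zscore = as.numeric(scale(IGF1_ng_ml)))

# The same z-score written out by hand, ignoring missing values
with_manual <- mutate(
  salmon,
  IGFngml_zscore = (IGF1_ng_ml - mean(IGF1_ng_ml, na.rm = TRUE)) /
    sd(IGF1_ng_ml, na.rm = TRUE)
)
```

Without a name on the left-hand side, the new column is labelled with the expression itself, as in the first output above.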

5.5 Summarize

Summarize applies an aggregating or summary function to each group and returns one row per group

## # A tibble: 4 x 2
##   common_name    count
##   <chr>          <int>
## 1 Chinook salmon    46
## 2 Coho salmon        2
## 3 Sockeye salmon    11
## 4 Steelhead         38
## # A tibble: 4 x 2
##   common_name    IGF1_ng_ml_ave
##   <chr>                   <dbl>
## 1 Chinook salmon           46.8
## 2 Coho salmon              73.6
## 3 Sockeye salmon          NaN  
## 4 Steelhead                46.1
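
A minimal sketch of a grouped summary, using the column names `count` and `IGF1_ng_ml_ave` from the output above; the object names are assumptions. Note that the mean of a group with only missing values is `NaN` once `na.rm = TRUE` removes them, which is why Sockeye salmon shows `NaN`.

```r
library(dplyr)

# First six rows of the salmon data shown earlier (object name is an assumption)
salmon <- tibble::tribble(
  ~salmon_id, ~common_name,     ~age_classbylength,   ~length_mm, ~IGF1_ng_ml,
  35032,      "Chinook salmon", "yearling",           147,        41.26006,
  35035,      "Sockeye salmon", "juvenile",           121,        NA,
  35036,      "Sockeye salmon", "juvenile",           112,        NA,
  35037,      "Steelhead",      "juvenile",           220,        42.70981,
  35038,      "Steelhead",      "juvenile",           152,        NA,
  35033,      "Chinook salmon", "mixed age juvenile", 444,        62.11528
)

# Group by species, then collapse each group to one row
species_summary <- summarize(
  group_by(salmon, common_name),
  count          = n(),                              # rows per group
  IGF1_ng_ml_ave = mean(IGF1_ng_ml, na.rm = TRUE)    # mean of non-missing values
)
species_summary
```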

5.6 Group

Grouping also changes how other functions behave: on a grouped dataframe, functions such as filter and mutate operate within each group

## # A tibble: 8 x 5
## # Groups:   common_name [4]
##   salmon_id common_name    age_classbylength length_mm IGF1_ng_ml
##       <dbl> <chr>          <chr>                 <dbl>      <dbl>
## 1     35038 Steelhead      juvenile                152       NA  
## 2     35055 Steelhead      juvenile                123       55.7
## 3     35145 Chinook salmon yearling                130       23.4
## 4     35085 Coho salmon    yearling                140       NA  
## 5     35087 Coho salmon    yearling                164       73.6
## 6     35095 Chinook salmon subyearling              90       NA  
## 7     35097 Sockeye salmon juvenile                110       NA  
## 8     35099 Sockeye salmon juvenile                111       NA
## # A tibble: 95 x 5
## # Groups:   common_name [3]
##    salmon_id common_name    age_classbylength  length_mm IGF1_ng_ml
##        <dbl> <chr>          <chr>                  <dbl>      <dbl>
##  1     35032 Chinook salmon yearling                 147       41.3
##  2     35035 Sockeye salmon juvenile                 121       NA  
##  3     35036 Sockeye salmon juvenile                 112       NA  
##  4     35037 Steelhead      juvenile                 220       42.7
##  5     35038 Steelhead      juvenile                 152       NA  
##  6     35033 Chinook salmon mixed age juvenile       444       62.1
##  7     35034 Sockeye salmon juvenile                 139       NA  
##  8     35048 Steelhead      juvenile                 288       24.2
##  9     35049 Steelhead      juvenile                 190       NA  
## 10     35050 Steelhead      juvenile                 283       63.5
## # … with 85 more rows
## # A tibble: 97 x 6
## # Groups:   common_name [4]
##    salmon_id common_name   age_classbylength length_mm IGF1_ng_ml IGFngml_zscore
##        <dbl> <chr>         <chr>                 <dbl>      <dbl>          <dbl>
##  1     35032 Chinook salm… yearling                147       41.3         -0.236
##  2     35035 Sockeye salm… juvenile                121       NA           NA    
##  3     35036 Sockeye salm… juvenile                112       NA           NA    
##  4     35037 Steelhead     juvenile                220       42.7         -0.176
##  5     35038 Steelhead     juvenile                152       NA           NA    
##  6     35033 Chinook salm… mixed age juveni…       444       62.1          0.652
##  7     35034 Sockeye salm… juvenile                139       NA           NA    
##  8     35048 Steelhead     juvenile                288       24.2         -1.12 
##  9     35049 Steelhead     juvenile                190       NA           NA    
## 10     35050 Steelhead     juvenile                283       63.5          0.890
## # … with 87 more rows
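
The sketch below shows grouped `filter()` and `mutate()` on a six-row stand-in. These are illustrative calls under assumed object names, not necessarily the ones that produced the output above.

```r
library(dplyr)

# First six rows of the salmon data shown earlier (object name is an assumption)
salmon <- tibble::tribble(
  ~salmon_id, ~common_name,     ~age_classbylength,   ~length_mm, ~IGF1_ng_ml,
  35032,      "Chinook salmon", "yearling",           147,        41.26006,
  35035,      "Sockeye salmon", "juvenile",           121,        NA,
  35036,      "Sockeye salmon", "juvenile",           112,        NA,
  35037,      "Steelhead",      "juvenile",           220,        42.70981,
  35038,      "Steelhead",      "juvenile",           152,        NA,
  35033,      "Chinook salmon", "mixed age juvenile", 444,        62.11528
)

by_species <- group_by(salmon, common_name)

# Shortest fish within each species
smallest <- filter(by_species, length_mm == min(length_mm))

# Keep only species with at least one IGF1 measurement
with_measure <- filter(by_species, any(!is.na(IGF1_ng_ml)))

# z-score computed within each species rather than across the whole table
z_within <- mutate(
  by_species,
  IGFngml_zscore = (IGF1_ng_ml - mean(IGF1_ng_ml, na.rm = TRUE)) /
    sd(IGF1_ng_ml, na.rm = TRUE)
)

# Drop the grouping once it is no longer needed
z_within <- ungroup(z_within)
```

This is why the grouped z-scores in the last output above differ from the ungrouped ones in the Mutate section.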

6 Piping (%>%): A way to string functions together

Piping allows you to pass the result from one expression directly into another.

-> same graphic as before, but extended https://github.com/trinker/tidyr_in_a_nutshell

6.1 Piping versus not piping

## # A tibble: 4 x 2
##   common_name    IGF1_ng_ml_ave
##   <chr>                   <dbl>
## 1 Chinook salmon           46.8
## 2 Coho salmon              73.6
## 3 Sockeye salmon          NaN  
## 4 Steelhead                46.1
## # A tibble: 4 x 2
##   common_name    IGF1_ng_ml_ave
##   <chr>                   <dbl>
## 1 Chinook salmon           46.8
## 2 Coho salmon              73.6
## 3 Sockeye salmon          NaN  
## 4 Steelhead                46.1
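
The two identical outputs above can be sketched as one nested call and one piped call. This uses a six-row stand-in for the salmon data; the object names are assumptions.

```r
library(dplyr)

# First six rows of the salmon data shown earlier (object name is an assumption)
salmon <- tibble::tribble(
  ~salmon_id, ~common_name,     ~age_classbylength,   ~length_mm, ~IGF1_ng_ml,
  35032,      "Chinook salmon", "yearling",           147,        41.26006,
  35035,      "Sockeye salmon", "juvenile",           121,        NA,
  35036,      "Sockeye salmon", "juvenile",           112,        NA,
  35037,      "Steelhead",      "juvenile",           220,        42.70981,
  35038,      "Steelhead",      "juvenile",           152,        NA,
  35033,      "Chinook salmon", "mixed age juvenile", 444,        62.11528
)

# Not piping: the calls are nested, so they read inside-out
nested <- summarize(
  group_by(salmon, common_name),
  IGF1_ng_ml_ave = mean(IGF1_ng_ml, na.rm = TRUE)
)

# Piping: each step passes its result to the next, read top to bottom
piped <- salmon %>%
  group_by(common_name) %>%
  summarize(IGF1_ng_ml_ave = mean(IGF1_ng_ml, na.rm = TRUE))

# Same result either way
all.equal(nested, piped)
```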

6.2 Building pipes together

## # A tibble: 2 x 2
##   common_name    IGF1_ng_ml_ave
##   <chr>                   <dbl>
## 1 Chinook salmon           77.9
## 2 Steelhead                45.3
## # A tibble: 6 x 3
## # Groups:   common_name [4]
##   common_name    size       IGF1_ng_ml_ave
##   <chr>          <chr>               <dbl>
## 1 Chinook salmon big_fish             77.9
## 2 Chinook salmon small_fish           39.5
## 3 Coho salmon    small_fish           73.6
## 4 Sockeye salmon small_fish          NaN  
## 5 Steelhead      big_fish             45.3
## 6 Steelhead      small_fish           47.3
## # A tibble: 4 x 3
## # Groups:   common_name [2]
##   common_name    size       IGF1_ng_ml_ave
##   <chr>          <chr>               <dbl>
## 1 Chinook salmon big_fish             77.9
## 2 Chinook salmon small_fish           39.5
## 3 Steelhead      big_fish             45.3
## 4 Steelhead      small_fish           47.3
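
A sketch of how longer pipes can be built step by step. The `size` labels come from the output above, but the 200 mm threshold separating `big_fish` from `small_fish` is a guess, as are the object names.

```r
library(dplyr)

# First six rows of the salmon data shown earlier (object name is an assumption)
salmon <- tibble::tribble(
  ~salmon_id, ~common_name,     ~age_classbylength,   ~length_mm, ~IGF1_ng_ml,
  35032,      "Chinook salmon", "yearling",           147,        41.26006,
  35035,      "Sockeye salmon", "juvenile",           121,        NA,
  35036,      "Sockeye salmon", "juvenile",           112,        NA,
  35037,      "Steelhead",      "juvenile",           220,        42.70981,
  35038,      "Steelhead",      "juvenile",           152,        NA,
  35033,      "Chinook salmon", "mixed age juvenile", 444,        62.11528
)

# Step 1: filter, then summarize by species
big_only <- salmon %>%
  filter(length_mm > 200) %>%               # hypothetical cutoff
  group_by(common_name) %>%
  summarize(IGF1_ng_ml_ave = mean(IGF1_ng_ml, na.rm = TRUE))

# Step 2: label every fish instead of dropping the small ones,
# group by both variables, and keep two species at the end
by_size <- salmon %>%
  mutate(size = if_else(length_mm > 200, "big_fish", "small_fish")) %>%
  group_by(common_name, size) %>%
  summarize(IGF1_ng_ml_ave = mean(IGF1_ng_ml, na.rm = TRUE)) %>%
  filter(common_name %in% c("Chinook salmon", "Steelhead"))
```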

7 readr: Reading data into R

We have blasted through what being tidy can give you. Now let’s tidy some data. The first step is to read the data in.

readr provides the following functions:

  • read_csv(): comma separated (CSV) files
  • read_tsv(): tab separated files
  • read_delim(): general delimited files
  • read_fwf(): fixed width files
  • read_table(): tabular files where columns are separated by white-space
  • read_log(): web log files

7.1 readr vs base R

##   ENTREZ CD34_1 ORTHO_1 CD34_2 ORTHO_2
## 1    350    204       0    103       0
## 2    351  15586     479  10476      39
## 3    353    842     355   1188      86
## 4    354      0       0      0       0
## 5    355    123     291    139      16
## 6    356      1       1      0       0
## Parsed with column specification:
## cols(
##   ENTREZ = col_double(),
##   CD34_1 = col_double(),
##   ORTHO_1 = col_double(),
##   CD34_2 = col_double(),
##   ORTHO_2 = col_double()
## )
## # A tibble: 100 x 5
##    ENTREZ CD34_1 ORTHO_1 CD34_2 ORTHO_2
##     <dbl>  <dbl>   <dbl>  <dbl>   <dbl>
##  1    350    204       0    103       0
##  2    351  15586     479  10476      39
##  3    353    842     355   1188      86
##  4    354      0       0      0       0
##  5    355    123     291    139      16
##  6    356      1       1      0       0
##  7    357    380       3    177       0
##  8    358    572    2225    597    4051
##  9    359      0      12      1       0
## 10    360    320     502     46    1114
## # … with 90 more rows
## # A tibble: 100 x 5
##    ENTREZ CD34_1 ORTHO_1 CD34_2 ORTHO_2
##    <chr>   <int>   <int>  <int>   <int>
##  1 350       204       0    103       0
##  2 351     15586     479  10476      39
##  3 353       842     355   1188      86
##  4 354         0       0      0       0
##  5 355       123     291    139      16
##  6 356         1       1      0       0
##  7 357       380       3    177       0
##  8 358       572    2225    597    4051
##  9 359         0      12      1       0
## 10 360       320     502     46    1114
## # … with 90 more rows
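
The comparison above can be sketched as follows. Since the original file is not available, the snippet writes the first three rows of the count table to a temporary CSV (a hypothetical stand-in) and reads it three ways; all object names are assumptions.

```r
library(readr)

# Write a tiny stand-in file so the example is self-contained
csv_path <- tempfile(fileext = ".csv")
writeLines(c(
  "ENTREZ,CD34_1,ORTHO_1,CD34_2,ORTHO_2",
  "350,204,0,103,0",
  "351,15586,479,10476,39",
  "353,842,355,1188,86"
), csv_path)

# Base R: returns a plain data.frame
base_df <- read.csv(csv_path)

# readr: returns a tibble and reports the column types it guessed (doubles here)
tidy_default <- read_csv(csv_path)

# readr with explicit types: gene IDs as character, counts as integer
tidy_typed <- read_csv(
  csv_path,
  col_types = cols(ENTREZ = col_character(), .default = col_integer())
)
```

Stating the column types up front avoids surprises when an identifier column such as ENTREZ happens to look numeric.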

8 Tibbles

8.1 Subsetting tibbles

## # A tibble: 100 x 1
##    ENTREZ
##    <chr> 
##  1 350   
##  2 351   
##  3 353   
##  4 354   
##  5 355   
##  6 356   
##  7 357   
##  8 358   
##  9 359   
## 10 360   
## # … with 90 more rows
## # A tibble: 1 x 5
##   ENTREZ CD34_1 ORTHO_1 CD34_2 ORTHO_2
##   <chr>   <int>   <int>  <int>   <int>
## 1 350       204       0    103       0
## # A tibble: 100 x 1
##    ENTREZ
##    <chr> 
##  1 350   
##  2 351   
##  3 353   
##  4 354   
##  5 355   
##  6 356   
##  7 357   
##  8 358   
##  9 359   
## 10 360   
## # … with 90 more rows
##   [1] "350" "351" "353" "354" "355" "356" "357" "358" "359" "360" "361" "362"
##  [13] "363" "364" "366" "367" "368" "369" "372" "373" "374" "375" "377" "378"
##  [25] "379" "381" "382" "383" "384" "387" "388" "389" "390" "391" "392" "393"
##  [37] "394" "395" "396" "397" "398" "399" "400" "401" "402" "403" "405" "406"
##  [49] "407" "408" "409" "410" "411" "412" "414" "415" "416" "417" "419" "420"
##  [61] "421" "427" "429" "430" "432" "433" "434" "435" "440" "443" "444" "445"
##  [73] "460" "462" "463" "466" "467" "468" "471" "472" "473" "474" "475" "476"
##  [85] "477" "478" "479" "480" "481" "482" "483" "486" "487" "488" "489" "490"
##  [97] "491" "492" "493" "495"
##   [1] "350" "351" "353" "354" "355" "356" "357" "358" "359" "360" "361" "362"
##  [13] "363" "364" "366" "367" "368" "369" "372" "373" "374" "375" "377" "378"
##  [25] "379" "381" "382" "383" "384" "387" "388" "389" "390" "391" "392" "393"
##  [37] "394" "395" "396" "397" "398" "399" "400" "401" "402" "403" "405" "406"
##  [49] "407" "408" "409" "410" "411" "412" "414" "415" "416" "417" "419" "420"
##  [61] "421" "427" "429" "430" "432" "433" "434" "435" "440" "443" "444" "445"
##  [73] "460" "462" "463" "466" "467" "468" "471" "472" "473" "474" "475" "476"
##  [85] "477" "478" "479" "480" "481" "482" "483" "486" "487" "488" "489" "490"
##  [97] "491" "492" "493" "495"
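
A sketch of the subsetting forms that give output like the above, on a three-row stand-in for the count table (the object name `counts` is an assumption). Single brackets on a tibble always return a tibble, while `$` and `[[ ]]` extract the column as a vector.

```r
library(tibble)

# First three rows of the count table shown earlier (object name is an assumption)
counts <- tribble(
  ~ENTREZ, ~CD34_1, ~ORTHO_1, ~CD34_2, ~ORTHO_2,
  "350",   204L,    0L,       103L,    0L,
  "351",   15586L,  479L,     10476L,  39L,
  "353",   842L,    355L,     1188L,   86L
)

col_tbl   <- counts["ENTREZ"]      # one-column tibble
first_row <- counts[1, ]           # one-row tibble
col_tbl2  <- counts[, "ENTREZ"]    # still a tibble, unlike a base data.frame
col_vec   <- counts$ENTREZ         # character vector
col_vec2  <- counts[["ENTREZ"]]    # same vector

# For contrast: a base data.frame drops to a vector here
base_vec <- as.data.frame(counts)[, "ENTREZ"]
```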

8.2 Converting Tibbles - Back and Forth

## # A tibble: 100 x 5
##    ENTREZ CD34_1 ORTHO_1 CD34_2 ORTHO_2
##     <int>  <int>   <int>  <int>   <int>
##  1    350    204       0    103       0
##  2    351  15586     479  10476      39
##  3    353    842     355   1188      86
##  4    354      0       0      0       0
##  5    355    123     291    139      16
##  6    356      1       1      0       0
##  7    357    380       3    177       0
##  8    358    572    2225    597    4051
##  9    359      0      12      1       0
## 10    360    320     502     46    1114
## # … with 90 more rows
## # A tibble: 100 x 5
##    ENTREZ CD34_1 ORTHO_1 CD34_2 ORTHO_2
##    <chr>   <int>   <int>  <int>   <int>
##  1 350       204       0    103       0
##  2 351     15586     479  10476      39
##  3 353       842     355   1188      86
##  4 354         0       0      0       0
##  5 355       123     291    139      16
##  6 356         1       1      0       0
##  7 357       380       3    177       0
##  8 358       572    2225    597    4051
##  9 359         0      12      1       0
## 10 360       320     502     46    1114
## # … with 90 more rows
##     ENTREZ CD34_1 ORTHO_1 CD34_2 ORTHO_2
## 1      350    204       0    103       0
## 2      351  15586     479  10476      39
## 3      353    842     355   1188      86
## 4      354      0       0      0       0
## 5      355    123     291    139      16
## 6      356      1       1      0       0
## 7      357    380       3    177       0
## 8      358    572    2225    597    4051
## 9      359      0      12      1       0
## 10     360    320     502     46    1114
## 11     361      0       1      0       0
## 12     362      3       1     15       0
## 13     363     14       6      4       1
## 14     364      7       0      1       0
## 15     366      6       0      1       0
## 16     367     42       0     51       1
## 17     368     28       0     24       0
## 18     369   1204    1034    833     478
## 19     372   2829    1864   2741     771
## 20     373    179     728    148     795
## 21     374     76       5    138       2
## 22     375   4428    6697   4970    4328
## 23     377   3170     314   2576      11
## 24     378   1839    4845   1767    2975
## 25     379    178       1    181       0
## 26     381   1617     574   1159     339
## 27     382   2874    2265   1746    1668
## 28     383     63    1632     40     721
## 29     384    148   10977    118      94
## 30     387   8899    2457   7405    1228
## 31     388  12598     171   5090      70
## 32     389   2709     193   2313       5
## 33     390   1004       0    395       0
## 34     391   1038     577   1176     164
## 35     392   1527     304    786      71
## 36     393   2949     138   1540       3
## 37     394   1525     464   1062     134
## 38     395    348      67    123       0
## 39     396   6503     702   4723     169
## 40     397  12997     410  11265      38
## 41     398      0       0      0       0
## 42     399    223      11    422       2
## 43     400   1188     147    806      56
## 44     401      0       0      0       0
## 45     402    504     218    496      80
## 46     403    289      25    166       4
## 47     405   1481     824   1004     812
## 48     406    295     175     87      35
## 49     407      4       1      2       1
## 50     408   2451     111   1523       6
## 51     409   2480    1819   1356     226
## 52     410    433     197    215      77
## 53     411    829     217    441     131
## 54     412    312      45    138      17
## 55     414    516      15    396       9
## 56     415     20       0     13       0
## 57     416      2       1      4       0
## 58     417      0       0      0       0
## 59     419      6       5      2       7
## 60     420    141    1136     94    1217
## 61     421    213     255     93     208
## 62     427   4699   11889   1729     926
## 63     429      0       0      2       0
## 64     430     34      13     22       0
## 65     432     63       0     55       1
## 66     433     38       0     26       0
## 67     434      1       1      2       0
## 68     435    408     151    284      34
## 69     440    157    2520    111     535
## 70     443      5       0      4       0
## 71     444   1583     151    747      14
## 72     445    116      15     90       1
## 73     460     34       0     68       0
## 74     462     70       0     31       1
## 75     463   1244     118    492      71
## 76     466    538     480    393     218
## 77     467   2506     402   2130      18
## 78     468   7991    6132   5307    1883
## 79     471   1272     771   1392      53
## 80     472   1389     628    739     138
## 81     473   4173     783   1901     776
## 82     474      0       0      0       0
## 83     475    284      83    467      34
## 84     476   4952    3453   4202    1416
## 85     477     13       3     26       2
## 86     478     78     121     67       0
## 87     479     17       0      4       0
## 88     480      9       5     11       1
## 89     481   1937      18   1017      34
## 90     482    157    1392     75    1660
## 91     483   1075    1454   1789    1141
## 92     486     47       0     18       0
## 93     487     29      33     19       3
## 94     488   4529    1118   2925     269
## 95     489   3465     153   3188       8
## 96     490   1610    1263    913     665
## 97     491     12       1      4       0
## 98     492      4       0      6       0
## 99     493   5011    3585   3053     743
## 100    495      0       0      0       0
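
The conversions shown above can be sketched with `as_tibble()` and `as.data.frame()`. The data here are the first three rows of the count table and the object names are assumptions.

```r
library(tibble)

# A plain base R data.frame with integer columns
base_df <- data.frame(
  ENTREZ  = c(350L, 351L, 353L),
  CD34_1  = c(204L, 15586L, 842L),
  ORTHO_1 = c(0L, 479L, 355L),
  CD34_2  = c(103L, 10476L, 1188L),
  ORTHO_2 = c(0L, 39L, 86L)
)

# data.frame -> tibble
counts <- as_tibble(base_df)

# Gene IDs are labels rather than quantities, so store them as character
counts$ENTREZ <- as.character(counts$ENTREZ)

# tibble -> data.frame, e.g. for functions that expect the base class
back_to_df <- as.data.frame(counts)
```

As the last output above shows, a printed data.frame lists every row, whereas a tibble prints only the first ten.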

9 Tidying data up

9.1 What is wrong with this dataframe from a tidy viewpoint?

  • Each variable has its own column
  • Each observation has its own row
  • Each value has its own cell
## # A tibble: 100 x 5
##    ENTREZ CD34_1 ORTHO_1 CD34_2 ORTHO_2
##    <chr>   <int>   <int>  <int>   <int>
##  1 350       204       0    103       0
##  2 351     15586     479  10476      39
##  3 353       842     355   1188      86
##  4 354         0       0      0       0
##  5 355       123     291    139      16
##  6 356         1       1      0       0
##  7 357       380       3    177       0
##  8 358       572    2225    597    4051
##  9 359         0      12      1       0
## 10 360       320     502     46    1114
## # … with 90 more rows

A single variable (the count) is spread across multiple columns, one per sample

9.2 How do we get tidy? - Pivot tools (formerly known as gather/spread)

## # A tibble: 100 x 5
##    ENTREZ CD34_1 ORTHO_1 CD34_2 ORTHO_2
##    <chr>   <int>   <int>  <int>   <int>
##  1 350       204       0    103       0
##  2 351     15586     479  10476      39
##  3 353       842     355   1188      86
##  4 354         0       0      0       0
##  5 355       123     291    139      16
##  6 356         1       1      0       0
##  7 357       380       3    177       0
##  8 358       572    2225    597    4051
##  9 359         0      12      1       0
## 10 360       320     502     46    1114
## # … with 90 more rows
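
A sketch of the pivot step on a three-row stand-in for the table above. The column names `Sample` and `counts` match the long table shown in the next section; the object names are assumptions.

```r
library(tidyr)

# First three rows of the count table shown above (object name is an assumption)
counts <- tibble::tribble(
  ~ENTREZ, ~CD34_1, ~ORTHO_1, ~CD34_2, ~ORTHO_2,
  "350",   204L,    0L,       103L,    0L,
  "351",   15586L,  479L,     10476L,  39L,
  "353",   842L,    355L,     1188L,   86L
)

# Wide -> long: every column except ENTREZ becomes a (Sample, counts) pair
long <- pivot_longer(
  counts,
  cols      = -ENTREZ,
  names_to  = "Sample",
  values_to = "counts"
)

# Long -> wide: the inverse operation
wide_again <- pivot_wider(long, names_from = Sample, values_from = counts)
```

`pivot_longer()` and `pivot_wider()` are the successors of `gather()` and `spread()`.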

9.3 What next?

## # A tibble: 400 x 3
##    ENTREZ Sample  counts
##    <chr>  <chr>    <int>
##  1 350    CD34_1     204
##  2 350    ORTHO_1      0
##  3 350    CD34_2     103
##  4 350    ORTHO_2      0
##  5 351    CD34_1   15586
##  6 351    ORTHO_1    479
##  7 351    CD34_2   10476
##  8 351    ORTHO_2     39
##  9 353    CD34_1     842
## 10 353    ORTHO_1    355
## # … with 390 more rows

Multiple variables in a single column: Sample now holds both the cell type and the replicate number

9.4 Dealing with multiple variables in a single column

How do we get tidy? - Cleaning up

## # A tibble: 400 x 4
##    ENTREZ CellType Rep   counts
##    <chr>  <chr>    <chr>  <int>
##  1 350    CD34     1        204
##  2 350    ORTHO    1          0
##  3 350    CD34     2        103
##  4 350    ORTHO    2          0
##  5 351    CD34     1      15586
##  6 351    ORTHO    1        479
##  7 351    CD34     2      10476
##  8 351    ORTHO    2         39
##  9 353    CD34     1        842
## 10 353    ORTHO    1        355
## # … with 390 more rows
## # A tibble: 400 x 5
##    ENTREZ Sample  CellType Rep   counts
##    <chr>  <chr>   <chr>    <chr>  <int>
##  1 350    CD34_1  CD34     1        204
##  2 350    ORTHO_1 ORTHO    1          0
##  3 350    CD34_2  CD34     2        103
##  4 350    ORTHO_2 ORTHO    2          0
##  5 351    CD34_1  CD34     1      15586
##  6 351    ORTHO_1 ORTHO    1        479
##  7 351    CD34_2  CD34     2      10476
##  8 351    ORTHO_2 ORTHO    2         39
##  9 353    CD34_1  CD34     1        842
## 10 353    ORTHO_1 ORTHO    1        355
## # … with 390 more rows
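
One way to produce the two tables above is tidyr's separate(), which splits a column at a delimiter. The sketch below assumes that is the approach taken; the object names are hypothetical and the toy input holds only the first four rows of the long table.

```r
library(tibble)
library(tidyr)

# Toy long table: Sample packs two variables (cell type and replicate) into one column.
long_counts <- tribble(
  ~ENTREZ, ~Sample,   ~counts,
  "350",   "CD34_1",     204L,
  "350",   "ORTHO_1",      0L,
  "350",   "CD34_2",     103L,
  "350",   "ORTHO_2",      0L
)

# Split Sample at the underscore; by default the original column is dropped.
split_counts <- separate(long_counts, Sample,
                         into = c("CellType", "Rep"), sep = "_")

# remove = FALSE keeps Sample alongside the two new columns.
split_keep <- separate(long_counts, Sample,
                       into = c("CellType", "Rep"), sep = "_", remove = FALSE)

split_counts
split_keep
```

Rep stays a character column here because separate() does not convert types unless asked to with convert = TRUE.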

11 Readr again: Writing your lovely new tibble to file
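
Writing a tibble to disk could look like the sketch below, which uses readr's write_csv() and reads the file back to check the round trip. The temporary file path and the object names are assumptions made for illustration; write_tsv() works the same way for tab-separated output.

```r
library(tibble)
library(readr)

# A small tidy table in the shape built in the previous section.
tidy_counts <- tibble(
  ENTREZ   = c("350", "351"),
  CellType = c("CD34", "CD34"),
  Rep      = c("1", "1"),
  counts   = c(204L, 15586L)
)

# Hypothetical output location; replace with a real path in your project.
out_file <- tempfile(fileext = ".csv")
write_csv(tidy_counts, out_file)

# Read it back. The compact col_types string keeps ENTREZ and Rep as
# character (c) and counts as integer (i) instead of letting readr guess.
round_trip <- read_csv(out_file, col_types = "ccci")
round_trip
```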

11.1 At this point we have covered or touched on the most essential facets of the tidyverse

  • ggplot2 – making pretty graphs
  • readr – reading data into R
  • dplyr – manipulating data
  • tibble – working with tibbles
  • tidyr – miscellaneous tools for tidying data
  • purrr – iterating over data
  • stringr – working with strings
  • forcats – working with factors

12 purrr - Functional programming

Applying functions to datasets

Base R users typically reach for for loops or the apply family.

The big advantage of purrr is that it handles nested data frames and returns outputs of a predictable type.

12.1 map - tidy way to iterate over a dataset

## $CD34_1
## [1] 1497.67
## 
## $ORTHO_1
## [1] 822.33
## 
## $CD34_2
## [1] 1056.85
## 
## $ORTHO_2
## [1] 329.05
##  CD34_1 ORTHO_1  CD34_2 ORTHO_2 
## 1497.67  822.33 1056.85  329.05
## # A tibble: 4 x 2
##   Sample  mean_counts
##   <chr>         <dbl>
## 1 CD34_1        1498.
## 2 CD34_2        1057.
## 3 ORTHO_1        822.
## 4 ORTHO_2        329.
##  CD34_1  CD34_2 ORTHO_1 ORTHO_2 
## 1497.67 1056.85  822.33  329.05
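
The difference between the list output and the named-vector output above can be sketched with map() and map_dbl(). This is a minimal sketch on a three-row toy table, so the means will not match the values printed above; the object names are assumptions.

```r
library(tibble)
library(purrr)

# Toy count columns: the first three genes of each sample.
counts <- tibble(
  CD34_1  = c(204, 15586, 842),
  ORTHO_1 = c(0, 479, 355),
  CD34_2  = c(103, 10476, 1188),
  ORTHO_2 = c(0, 39, 86)
)

# map() applies mean() to every column and always returns a list.
means_list <- map(counts, mean)

# map_dbl() does the same but returns a named double vector,
# and errors if any result is not a single number.
means_dbl <- map_dbl(counts, mean)

means_list
means_dbl
```

The typed variants (map_dbl(), map_chr(), map_lgl() and so on) are what give purrr its predictable outputs: the return type is fixed by the function name rather than by the data.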

12.2 Nest - simplifying your dataframe by making it more complex

## # A tibble: 4 x 2
## # Groups:   Sample [4]
##   Sample             data
##   <chr>   <list<df[,10]>>
## 1 CD34_1        [94 × 10]
## 2 ORTHO_1       [94 × 10]
## 3 CD34_2        [94 × 10]
## 4 ORTHO_2       [94 × 10]
## [1] "vctrs_list_of" "vctrs_vctr"    "oldClass"
## # A tibble: 94 x 10
##    ENTREZ CellType Rep   counts count_total      CPM SYMBOL CHR   LENGTH     TPM
##    <chr>  <chr>    <chr>  <int>       <int>    <dbl> <chr>  <chr>  <int>   <dbl>
##  1 350    CD34     1        204         307   1.36e3 APOH   chr17   1201  1.13e3
##  2 351    CD34     1      15586       26580   1.04e5 APP    chr21   4480  2.32e4
##  3 353    CD34     1        842        2471   5.62e3 APRT   chr16    807  6.97e3
##  4 355    CD34     1        123         569   8.21e2 FAS    chr10   6691  1.23e2
##  5 356    CD34     1          1           2   6.68e0 FASLG  chr1    1859  3.59e0
##  6 357    CD34     1        380         560   2.54e3 SHROO… chrX    8206  3.09e2
##  7 358    CD34     1        572        7445   3.82e3 AQP1   chr7    3786  1.01e3
##  8 359    CD34     1          0          13   0.     AQP2   chr12   4179  0.    
##  9 360    CD34     1        320        1982   2.14e3 AQP3   chr9    2950  7.24e2
## 10 361    CD34     1          0           1   0.     AQP4   chr18   5217  0.    
## # … with 84 more rows
## # A tibble: 4 x 3
## # Groups:   Sample [4]
##   Sample             data my_model
##   <chr>   <list<df[,10]>> <list>  
## 1 CD34_1        [94 × 10] <lm>    
## 2 ORTHO_1       [94 × 10] <lm>    
## 3 CD34_2        [94 × 10] <lm>    
## 4 ORTHO_2       [94 × 10] <lm>
## [1] "list"             "vector"           "AssayData"        "list_OR_List"    
## [5] "vector_OR_factor"
## 
## Call:
## lm(formula = CPM ~ TPM, data = .)
## 
## Coefficients:
## (Intercept)          TPM  
##    3864.980        1.723
## # A tibble: 4 x 4
## # Groups:   Sample [4]
##   Sample             data my_model my_tidy_model   
##   <chr>   <list<df[,10]>> <list>   <list>          
## 1 CD34_1        [94 × 10] <lm>     <tibble [2 × 5]>
## 2 ORTHO_1       [94 × 10] <lm>     <tibble [2 × 5]>
## 3 CD34_2        [94 × 10] <lm>     <tibble [2 × 5]>
## 4 ORTHO_2       [94 × 10] <lm>     <tibble [2 × 5]>
## # A tibble: 2 x 5
##   term        estimate std.error statistic  p.value
##   <chr>          <dbl>     <dbl>     <dbl>    <dbl>
## 1 (Intercept)  3865.    1101.         3.51 6.94e- 4
## 2 TPM             1.72     0.108     16.0  2.75e-28
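
The nest, model, tidy sequence shown in the output above can be sketched as follows. The column names Sample, CPM, TPM, my_model and my_tidy_model come from the output; the numbers in the toy table are made up for illustration and are not the workshop data, and the use of broom::tidy() is an assumption based on the shape of the tidied model.

```r
library(dplyr)
library(tidyr)
library(purrr)
library(broom)

# Made-up values for illustration only.
toy <- tibble(
  Sample = rep(c("CD34_1", "ORTHO_1"), each = 4),
  TPM    = c(1, 2, 3, 4, 1, 2, 3, 4),
  CPM    = c(2.1, 3.9, 6.2, 7.8, 3.2, 5.9, 9.1, 11.8)
)

nested <- toy %>%
  group_by(Sample) %>%
  nest() %>%                                            # one row per Sample, data is a list-column
  mutate(
    my_model      = map(data, ~ lm(CPM ~ TPM, data = .x)),  # fit one model per nested data frame
    my_tidy_model = map(my_model, tidy)                 # coefficients as a small tibble
  )

nested
nested$my_tidy_model[[1]]
```

Keeping the data, the model and its summary in the same row means they cannot drift out of sync when the table is filtered or reordered.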

12.3 Unnest - Expand out dataframes

## # A tibble: 8 x 8
## # Groups:   Sample [4]
##   Sample           data my_model term      estimate std.error statistic  p.value
##   <chr>   <list<df[,10> <list>   <chr>        <dbl>     <dbl>     <dbl>    <dbl>
## 1 CD34_1      [94 × 10] <lm>     (Interce…  3865.   1101.          3.51 6.94e- 4
## 2 CD34_1      [94 × 10] <lm>     TPM           1.72    0.108      16.0  2.75e-28
## 3 ORTHO_1     [94 × 10] <lm>     (Interce…  1522.    784.          1.94 5.53e- 2
## 4 ORTHO_1     [94 × 10] <lm>     TPM           2.20    0.0696     31.6  3.83e-51
## 5 CD34_2      [94 × 10] <lm>     (Interce…  4083.   1085.          3.76 2.94e- 4
## 6 CD34_2      [94 × 10] <lm>     TPM           1.54    0.0944     16.3  7.03e-29
## 7 ORTHO_2     [94 × 10] <lm>     (Interce…  1787.    937.          1.91 5.96e- 2
## 8 ORTHO_2     [94 × 10] <lm>     TPM           2.21    0.0884     25.0  8.45e-43
## # A tibble: 752 x 17
## # Groups:   Sample [4]
##    Sample ENTREZ CellType Rep   counts count_total    CPM SYMBOL CHR   LENGTH
##    <chr>  <chr>  <chr>    <chr>  <int>       <int>  <dbl> <chr>  <chr>  <int>
##  1 CD34_1 350    CD34     1        204         307 1.36e3 APOH   chr17   1201
##  2 CD34_1 351    CD34     1      15586       26580 1.04e5 APP    chr21   4480
##  3 CD34_1 353    CD34     1        842        2471 5.62e3 APRT   chr16    807
##  4 CD34_1 355    CD34     1        123         569 8.21e2 FAS    chr10   6691
##  5 CD34_1 356    CD34     1          1           2 6.68e0 FASLG  chr1    1859
##  6 CD34_1 357    CD34     1        380         560 2.54e3 SHROO… chrX    8206
##  7 CD34_1 358    CD34     1        572        7445 3.82e3 AQP1   chr7    3786
##  8 CD34_1 359    CD34     1          0          13 0.     AQP2   chr12   4179
##  9 CD34_1 360    CD34     1        320        1982 2.14e3 AQP3   chr9    2950
## 10 CD34_1 361    CD34     1          0           1 0.     AQP4   chr18   5217
## # … with 742 more rows, and 7 more variables: TPM <dbl>, my_model <list>,
## #   term <chr>, estimate <dbl>, std.error <dbl>, statistic <dbl>, p.value <dbl>
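
Expanding a list-column back into ordinary rows and columns can be sketched with tidyr's unnest(). The nested tibble below is built by hand and is an assumption for illustration; its estimates are the rounded values printed above for CD34_1 and ORTHO_1.

```r
library(tibble)
library(tidyr)

# Hand-built nested table: one small tibble of coefficients per sample.
nested <- tibble(
  Sample = c("CD34_1", "ORTHO_1"),
  my_tidy_model = list(
    tibble(term = c("(Intercept)", "TPM"), estimate = c(3865, 1.72)),
    tibble(term = c("(Intercept)", "TPM"), estimate = c(1522, 2.20))
  )
)

# unnest() repeats Sample once for every row of the inner tibble.
flat <- unnest(nested, cols = my_tidy_model)
flat
```

Unnesting a column with 2 rows per sample doubles the row count, which is why 4 samples give the 8-row table above, while unnesting the 94-row data column as well gives 752 rows.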

13 Stringr

Data that involves hand-entered text, such as clinical study metadata or a hand-typed list of genes of interest, will often contain errors. Tidying data also means fixing these problems, and stringr makes this easy.

  • Access and manipulate characters
  • Deal with whitespace
  • Pattern recognition

Though stringr is pretty comprehensive and covers most of what you will need, there is a sister package called stringi with even more functionality.
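
As a small illustration of the whitespace point, the sketch below cleans a hand-typed vector with str_trim() and str_squish(). The messy input values are invented for the example.

```r
library(stringr)

# Invented hand-typed entries with stray spaces.
messy <- c("  APOH", "APP  ", " AQP1 ", " Chinook   salmon ")

# str_trim() removes whitespace at the start and end only.
trimmed <- str_trim(messy)

# str_squish() also collapses runs of internal whitespace to a single space.
squished <- str_squish(messy)

trimmed
squished
```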

13.3 Converting strings - Capitalization

## [1] "APOH" "APOH" "APOH" "APOH" "APP"  "APP"
## [1] "Apoh" "Apoh" "Apoh" "Apoh" "App"  "App"
## # A tibble: 376 x 11
## # Groups:   Sample [4]
##    ENTREZ Sample CellType Rep   counts count_total    CPM SYMBOL CHR   LENGTH
##    <chr>  <chr>  <chr>    <chr>  <int>       <int>  <dbl> <chr>  <chr>  <int>
##  1 350    CD34_1 CD34     1        204         307 1.36e3 Apoh   chr17   1201
##  2 350    ORTHO… ORTHO    1          0         307 0.     Apoh   chr17   1201
##  3 350    CD34_2 CD34     2        103         307 9.75e2 Apoh   chr17   1201
##  4 350    ORTHO… ORTHO    2          0         307 0.     Apoh   chr17   1201
##  5 351    CD34_1 CD34     1      15586       26580 1.04e5 App    chr21   4480
##  6 351    ORTHO… ORTHO    1        479       26580 5.82e3 App    chr21   4480
##  7 351    CD34_2 CD34     2      10476       26580 9.91e4 App    chr21   4480
##  8 351    ORTHO… ORTHO    2         39       26580 1.19e3 App    chr21   4480
##  9 353    CD34_1 CD34     1        842        2471 5.62e3 Aprt   chr16    807
## 10 353    ORTHO… ORTHO    1        355        2471 4.32e3 Aprt   chr16    807
## # … with 366 more rows, and 1 more variable: TPM <dbl>
## # A tibble: 376 x 11
## # Groups:   Sample [4]
##    ENTREZ Sample CellType Rep   counts count_total    CPM SYMBOL CHR   LENGTH
##    <chr>  <chr>  <chr>    <chr>  <int>       <int>  <dbl> <chr>  <chr>  <int>
##  1 350    CD34_1 CD34     1        204         307 1.36e3 APOH   CHR17   1201
##  2 350    ORTHO… ORTHO    1          0         307 0.     APOH   CHR17   1201
##  3 350    CD34_2 CD34     2        103         307 9.75e2 APOH   CHR17   1201
##  4 350    ORTHO… ORTHO    2          0         307 0.     APOH   CHR17   1201
##  5 351    CD34_1 CD34     1      15586       26580 1.04e5 APP    CHR21   4480
##  6 351    ORTHO… ORTHO    1        479       26580 5.82e3 APP    CHR21   4480
##  7 351    CD34_2 CD34     2      10476       26580 9.91e4 APP    CHR21   4480
##  8 351    ORTHO… ORTHO    2         39       26580 1.19e3 APP    CHR21   4480
##  9 353    CD34_1 CD34     1        842        2471 5.62e3 APRT   CHR16    807
## 10 353    ORTHO… ORTHO    1        355        2471 4.32e3 APRT   CHR16    807
## # … with 366 more rows, and 1 more variable: TPM <dbl>
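
The case conversions shown above can be sketched with stringr's str_to_*() family. The vectors below reuse gene symbols and chromosome names from the output; the object names are assumptions.

```r
library(stringr)

symbols <- c("APOH", "APP", "APRT")
chrs    <- c("chr17", "chr21", "chr16")

# Title case: first letter upper, the rest lower.
title_symbols <- str_to_title(symbols)

# Upper and lower case.
upper_chrs    <- str_to_upper(chrs)
lower_symbols <- str_to_lower(symbols)

title_symbols
upper_chrs
lower_symbols
```

Inside a data frame the same functions are typically applied to a column with dplyr::mutate(), for example mutate(SYMBOL = str_to_title(SYMBOL)).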

13.4 Finding patterns

##  [1]  TRUE  TRUE  TRUE FALSE FALSE  TRUE  TRUE FALSE FALSE FALSE FALSE FALSE
## [13] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
## [25] FALSE FALSE FALSE FALSE FALSE FALSE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE
## [37] FALSE FALSE FALSE FALSE FALSE FALSE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE
## [49]  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE
## [61]  TRUE  TRUE  TRUE  TRUE FALSE  TRUE FALSE FALSE  TRUE  TRUE  TRUE  TRUE
## [73]  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE FALSE FALSE FALSE FALSE  TRUE  TRUE
## [85]  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE
## [97]  TRUE
##  [1] "Chinook salmon" "Sockeye salmon" "Sockeye salmon" "Chinook salmon"
##  [5] "Sockeye salmon" "Chinook salmon" "Chinook salmon" "Sockeye salmon"
##  [9] "Chinook salmon" "Chinook salmon" "Sockeye salmon" "Chinook salmon"
## [13] "Chinook salmon" "Chinook salmon" "Chinook salmon" "Chinook salmon"
## [17] "Chinook salmon" "Chinook salmon" "Chinook salmon" "Coho salmon"   
## [21] "Coho salmon"    "Chinook salmon" "Chinook salmon" "Chinook salmon"
## [25] "Chinook salmon" "Chinook salmon" "Chinook salmon" "Chinook salmon"
## [29] "Sockeye salmon" "Sockeye salmon" "Sockeye salmon" "Sockeye salmon"
## [33] "Sockeye salmon" "Chinook salmon" "Chinook salmon" "Chinook salmon"
## [37] "Chinook salmon" "Chinook salmon" "Chinook salmon" "Chinook salmon"
## [41] "Chinook salmon" "Chinook salmon" "Chinook salmon" "Sockeye salmon"
## [45] "Chinook salmon" "Chinook salmon" "Chinook salmon" "Chinook salmon"
## [49] "Chinook salmon" "Chinook salmon" "Chinook salmon" "Chinook salmon"
## [53] "Chinook salmon" "Chinook salmon" "Chinook salmon" "Chinook salmon"
## [57] "Chinook salmon" "Chinook salmon" "Chinook salmon"
##  [1]  TRUE  TRUE  TRUE FALSE FALSE  TRUE  TRUE FALSE FALSE FALSE FALSE FALSE
## [13] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
## [25] FALSE FALSE FALSE FALSE FALSE FALSE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE
## [37] FALSE FALSE FALSE FALSE FALSE FALSE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE
## [49]  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE
## [61]  TRUE  TRUE  TRUE  TRUE FALSE  TRUE FALSE FALSE  TRUE  TRUE  TRUE  TRUE
## [73]  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE FALSE FALSE FALSE FALSE  TRUE  TRUE
## [85]  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE
## [97]  TRUE
## # A tibble: 59 x 5
##    salmon_id common_name    age_classbylength  length_mm IGF1_ng_ml
##        <dbl> <chr>          <chr>                  <dbl>      <dbl>
##  1     35032 Chinook salmon yearling                 147       41.3
##  2     35035 Sockeye salmon juvenile                 121       NA  
##  3     35036 Sockeye salmon juvenile                 112       NA  
##  4     35033 Chinook salmon mixed age juvenile       444       62.1
##  5     35034 Sockeye salmon juvenile                 139       NA  
##  6     35142 Chinook salmon yearling                 149       66.5
##  7     35143 Chinook salmon yearling                 204       80.9
##  8     35144 Sockeye salmon juvenile                 140       NA  
##  9     35145 Chinook salmon yearling                 130       23.4
## 10     35146 Chinook salmon mixed age juvenile       422      101. 
## # … with 49 more rows
##  [1] 1 1 1 0 0 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 0 0
## [39] 0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 1 0 0 1 1 1 1 1 1 1 1
## [77] 1 1 0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
##  [1] 3 2 2 0 0 3 2 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 3 3 2 3 3 2 0 0
## [39] 0 0 0 0 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 2 2 2 2 2 0 3 0 0 3 3 3 3 3 3 3 3
## [77] 3 2 0 0 0 0 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3
##  [1] "Chinook salmon"  "Sockeye salmon"  "Sockeye salmon"  "Steelhead trout"
##  [5] "Steelhead trout" "Chinook salmon"  "Sockeye salmon"  "Steelhead trout"
##  [9] "Steelhead trout" "Steelhead trout" "Steelhead trout" "Steelhead trout"
## [13] "Steelhead trout" "Steelhead trout" "Steelhead trout" "Steelhead trout"
## [17] "Steelhead trout" "Steelhead trout" "Steelhead trout" "Steelhead trout"
## [21] "Steelhead trout" "Steelhead trout" "Steelhead trout" "Steelhead trout"
## [25] "Steelhead trout" "Steelhead trout" "Steelhead trout" "Steelhead trout"
## [29] "Steelhead trout" "Steelhead trout" "Chinook salmon"  "Chinook salmon" 
## [33] "Sockeye salmon"  "Chinook salmon"  "Chinook salmon"  "Sockeye salmon" 
## [37] "Steelhead trout" "Steelhead trout" "Steelhead trout" "Steelhead trout"
## [41] "Steelhead trout" "Steelhead trout" "Chinook salmon"  "Chinook salmon" 
## [45] "Chinook salmon"  "Chinook salmon"  "Chinook salmon"  "Chinook salmon" 
## [49] "Chinook salmon"  "Chinook salmon"  "Coho salmon"     "Coho salmon"    
## [53] "Chinook salmon"  "Chinook salmon"  "Chinook salmon"  "Chinook salmon" 
## [57] "Chinook salmon"  "Chinook salmon"  "Chinook salmon"  "Sockeye salmon" 
## [61] "Sockeye salmon"  "Sockeye salmon"  "Sockeye salmon"  "Sockeye salmon" 
## [65] "Steelhead trout" "Chinook salmon"  "Steelhead trout" "Steelhead trout"
## [69] "Chinook salmon"  "Chinook salmon"  "Chinook salmon"  "Chinook salmon" 
## [73] "Chinook salmon"  "Chinook salmon"  "Chinook salmon"  "Chinook salmon" 
## [77] "Chinook salmon"  "Sockeye salmon"  "Steelhead trout" "Steelhead trout"
## [81] "Steelhead trout" "Steelhead trout" "Chinook salmon"  "Chinook salmon" 
## [85] "Chinook salmon"  "Chinook salmon"  "Chinook salmon"  "Chinook salmon" 
## [89] "Chinook salmon"  "Chinook salmon"  "Chinook salmon"  "Chinook salmon" 
## [93] "Chinook salmon"  "Chinook salmon"  "Chinook salmon"  "Chinook salmon" 
## [97] "Chinook salmon"
## # A tibble: 97 x 5
##    salmon_id common_name     age_classbylength  length_mm IGF1_ng_ml
##        <dbl> <chr>           <chr>                  <dbl>      <dbl>
##  1     35032 Chinook salmon  yearling                 147       41.3
##  2     35035 Sockeye salmon  juvenile                 121       NA  
##  3     35036 Sockeye salmon  juvenile                 112       NA  
##  4     35037 Steelhead trout juvenile                 220       42.7
##  5     35038 Steelhead trout juvenile                 152       NA  
##  6     35033 Chinook salmon  mixed age juvenile       444       62.1
##  7     35034 Sockeye salmon  juvenile                 139       NA  
##  8     35048 Steelhead trout juvenile                 288       24.2
##  9     35049 Steelhead trout juvenile                 190       NA  
## 10     35050 Steelhead trout juvenile                 283       63.5
## # … with 87 more rows
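
The logical, subset and count outputs above correspond to three related stringr helpers, sketched below on a four-element toy vector of common names. The pattern "salmon" is an assumption inferred from the output, not the workshop's confirmed code.

```r
library(stringr)

common_name <- c("Chinook salmon", "Sockeye salmon", "Steelhead trout", "Coho salmon")

# str_detect(): TRUE/FALSE for whether each string contains the pattern.
is_salmon <- str_detect(common_name, "salmon")

# str_subset(): keep only the strings that match.
salmon_only <- str_subset(common_name, "salmon")

# str_count(): number of matches in each string.
salmon_count <- str_count(common_name, "salmon")

is_salmon
salmon_only
salmon_count
```

Because str_detect() returns a logical vector, it can be used directly inside dplyr::filter() to keep only the matching rows of a tibble.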

14 forcats - Handling factors

Factors are a data type that R uses to handle fixed categorical variables that have a known set of possible values.

Factor levels have a defined order, allowing hierarchy to be preserved in relatively simple vectors.

14.1 Making a factor - This is all base

##   [1] CD34_1  ORTHO_1 CD34_2  ORTHO_2 CD34_1  ORTHO_1 CD34_2  ORTHO_2 CD34_1 
##  [10] ORTHO_1 CD34_2  ORTHO_2 CD34_1  ORTHO_1 CD34_2  ORTHO_2 CD34_1  ORTHO_1
##  [19] CD34_2  ORTHO_2 CD34_1  ORTHO_1 CD34_2  ORTHO_2 CD34_1  ORTHO_1 CD34_2 
##  [28] ORTHO_2 CD34_1  ORTHO_1 CD34_2  ORTHO_2 CD34_1  ORTHO_1 CD34_2  ORTHO_2
##  [37] CD34_1  ORTHO_1 CD34_2  ORTHO_2 CD34_1  ORTHO_1 CD34_2  ORTHO_2 CD34_1 
##  [46] ORTHO_1 CD34_2  ORTHO_2 CD34_1  ORTHO_1 CD34_2  ORTHO_2 CD34_1  ORTHO_1
##  [55] CD34_2  ORTHO_2 CD34_1  ORTHO_1 CD34_2  ORTHO_2 CD34_1  ORTHO_1 CD34_2 
##  [64] ORTHO_2 CD34_1  ORTHO_1 CD34_2  ORTHO_2 CD34_1  ORTHO_1 CD34_2  ORTHO_2
##  [73] CD34_1  ORTHO_1 CD34_2  ORTHO_2 CD34_1  ORTHO_1 CD34_2  ORTHO_2 CD34_1 
##  [82] ORTHO_1 CD34_2  ORTHO_2 CD34_1  ORTHO_1 CD34_2  ORTHO_2 CD34_1  ORTHO_1
##  [91] CD34_2  ORTHO_2 CD34_1  ORTHO_1 CD34_2  ORTHO_2 CD34_1  ORTHO_1 CD34_2 
## [100] ORTHO_2 CD34_1  ORTHO_1 CD34_2  ORTHO_2 CD34_1  ORTHO_1 CD34_2  ORTHO_2
## [109] CD34_1  ORTHO_1 CD34_2  ORTHO_2 CD34_1  ORTHO_1 CD34_2  ORTHO_2 CD34_1 
## [118] ORTHO_1 CD34_2  ORTHO_2 CD34_1  ORTHO_1 CD34_2  ORTHO_2 CD34_1  ORTHO_1
## [127] CD34_2  ORTHO_2 CD34_1  ORTHO_1 CD34_2  ORTHO_2 CD34_1  ORTHO_1 CD34_2 
## [136] ORTHO_2 CD34_1  ORTHO_1 CD34_2  ORTHO_2 CD34_1  ORTHO_1 CD34_2  ORTHO_2
## [145] CD34_1  ORTHO_1 CD34_2  ORTHO_2 CD34_1  ORTHO_1 CD34_2  ORTHO_2 CD34_1 
## [154] ORTHO_1 CD34_2  ORTHO_2 CD34_1  ORTHO_1 CD34_2  ORTHO_2 CD34_1  ORTHO_1
## [163] CD34_2  ORTHO_2 CD34_1  ORTHO_1 CD34_2  ORTHO_2 CD34_1  ORTHO_1 CD34_2 
## [172] ORTHO_2 CD34_1  ORTHO_1 CD34_2  ORTHO_2 CD34_1  ORTHO_1 CD34_2  ORTHO_2
## [181] CD34_1  ORTHO_1 CD34_2  ORTHO_2 CD34_1  ORTHO_1 CD34_2  ORTHO_2 CD34_1 
## [190] ORTHO_1 CD34_2  ORTHO_2 CD34_1  ORTHO_1 CD34_2  ORTHO_2 CD34_1  ORTHO_1
## [199] CD34_2  ORTHO_2 CD34_1  ORTHO_1 CD34_2  ORTHO_2 CD34_1  ORTHO_1 CD34_2 
## [208] ORTHO_2 CD34_1  ORTHO_1 CD34_2  ORTHO_2 CD34_1  ORTHO_1 CD34_2  ORTHO_2
## [217] CD34_1  ORTHO_1 CD34_2  ORTHO_2 CD34_1  ORTHO_1 CD34_2  ORTHO_2 CD34_1 
## [226] ORTHO_1 CD34_2  ORTHO_2 CD34_1  ORTHO_1 CD34_2  ORTHO_2 CD34_1  ORTHO_1
## [235] CD34_2  ORTHO_2 CD34_1  ORTHO_1 CD34_2  ORTHO_2 CD34_1  ORTHO_1 CD34_2 
## [244] ORTHO_2 CD34_1  ORTHO_1 CD34_2  ORTHO_2 CD34_1  ORTHO_1 CD34_2  ORTHO_2
## [253] CD34_1  ORTHO_1 CD34_2  ORTHO_2 CD34_1  ORTHO_1 CD34_2  ORTHO_2 CD34_1 
## [262] ORTHO_1 CD34_2  ORTHO_2 CD34_1  ORTHO_1 CD34_2  ORTHO_2 CD34_1  ORTHO_1
## [271] CD34_2  ORTHO_2 CD34_1  ORTHO_1 CD34_2  ORTHO_2 CD34_1  ORTHO_1 CD34_2 
## [280] ORTHO_2 CD34_1  ORTHO_1 CD34_2  ORTHO_2 CD34_1  ORTHO_1 CD34_2  ORTHO_2
## [289] CD34_1  ORTHO_1 CD34_2  ORTHO_2 CD34_1  ORTHO_1 CD34_2  ORTHO_2 CD34_1 
## [298] ORTHO_1 CD34_2  ORTHO_2 CD34_1  ORTHO_1 CD34_2  ORTHO_2 CD34_1  ORTHO_1
## [307] CD34_2  ORTHO_2 CD34_1  ORTHO_1 CD34_2  ORTHO_2 CD34_1  ORTHO_1 CD34_2 
## [316] ORTHO_2 CD34_1  ORTHO_1 CD34_2  ORTHO_2 CD34_1  ORTHO_1 CD34_2  ORTHO_2
## [325] CD34_1  ORTHO_1 CD34_2  ORTHO_2 CD34_1  ORTHO_1 CD34_2  ORTHO_2 CD34_1 
## [334] ORTHO_1 CD34_2  ORTHO_2 CD34_1  ORTHO_1 CD34_2  ORTHO_2 CD34_1  ORTHO_1
## [343] CD34_2  ORTHO_2 CD34_1  ORTHO_1 CD34_2  ORTHO_2 CD34_1  ORTHO_1 CD34_2 
## [352] ORTHO_2 CD34_1  ORTHO_1 CD34_2  ORTHO_2 CD34_1  ORTHO_1 CD34_2  ORTHO_2
## [361] CD34_1  ORTHO_1 CD34_2  ORTHO_2 CD34_1  ORTHO_1 CD34_2  ORTHO_2 CD34_1 
## [370] ORTHO_1 CD34_2  ORTHO_2 CD34_1  ORTHO_1 CD34_2  ORTHO_2
## Levels: CD34_1 CD34_2 ORTHO_1 ORTHO_2
## # A tibble: 376 x 11
##    ENTREZ Sample CellType Rep   counts count_total    CPM SYMBOL CHR   LENGTH
##    <chr>  <fct>  <chr>    <chr>  <int>       <int>  <dbl> <chr>  <chr>  <int>
##  1 350    CD34_1 CD34     1        204         307 1.36e3 APOH   chr17   1201
##  2 350    ORTHO… ORTHO    1          0         307 0.     APOH   chr17   1201
##  3 350    CD34_2 CD34     2        103         307 9.75e2 APOH   chr17   1201
##  4 350    ORTHO… ORTHO    2          0         307 0.     APOH   chr17   1201
##  5 351    CD34_1 CD34     1      15586       26580 1.04e5 APP    chr21   4480
##  6 351    ORTHO… ORTHO    1        479       26580 5.82e3 APP    chr21   4480
##  7 351    CD34_2 CD34     2      10476       26580 9.91e4 APP    chr21   4480
##  8 351    ORTHO… ORTHO    2         39       26580 1.19e3 APP    chr21   4480
##  9 353    CD34_1 CD34     1        842        2471 5.62e3 APRT   chr16    807
## 10 353    ORTHO… ORTHO    1        355        2471 4.32e3 APRT   chr16    807
## # … with 366 more rows, and 1 more variable: TPM <dbl>
##   [1] CD34_1  ORTHO_1 CD34_2  ORTHO_2 CD34_1  ORTHO_1 CD34_2  ORTHO_2 CD34_1 
##  [10] ORTHO_1 CD34_2  ORTHO_2 CD34_1  ORTHO_1 CD34_2  ORTHO_2 CD34_1  ORTHO_1
##  [19] CD34_2  ORTHO_2 CD34_1  ORTHO_1 CD34_2  ORTHO_2 CD34_1  ORTHO_1 CD34_2 
##  [28] ORTHO_2 CD34_1  ORTHO_1 CD34_2  ORTHO_2 CD34_1  ORTHO_1 CD34_2  ORTHO_2
##  [37] CD34_1  ORTHO_1 CD34_2  ORTHO_2 CD34_1  ORTHO_1 CD34_2  ORTHO_2 CD34_1 
##  [46] ORTHO_1 CD34_2  ORTHO_2 CD34_1  ORTHO_1 CD34_2  ORTHO_2 CD34_1  ORTHO_1
##  [55] CD34_2  ORTHO_2 CD34_1  ORTHO_1 CD34_2  ORTHO_2 CD34_1  ORTHO_1 CD34_2 
##  [64] ORTHO_2 CD34_1  ORTHO_1 CD34_2  ORTHO_2 CD34_1  ORTHO_1 CD34_2  ORTHO_2
##  [73] CD34_1  ORTHO_1 CD34_2  ORTHO_2 CD34_1  ORTHO_1 CD34_2  ORTHO_2 CD34_1 
##  [82] ORTHO_1 CD34_2  ORTHO_2 CD34_1  ORTHO_1 CD34_2  ORTHO_2 CD34_1  ORTHO_1
##  [91] CD34_2  ORTHO_2 CD34_1  ORTHO_1 CD34_2  ORTHO_2 CD34_1  ORTHO_1 CD34_2 
## [100] ORTHO_2 CD34_1  ORTHO_1 CD34_2  ORTHO_2 CD34_1  ORTHO_1 CD34_2  ORTHO_2
## [109] CD34_1  ORTHO_1 CD34_2  ORTHO_2 CD34_1  ORTHO_1 CD34_2  ORTHO_2 CD34_1 
## [118] ORTHO_1 CD34_2  ORTHO_2 CD34_1  ORTHO_1 CD34_2  ORTHO_2 CD34_1  ORTHO_1
## [127] CD34_2  ORTHO_2 CD34_1  ORTHO_1 CD34_2  ORTHO_2 CD34_1  ORTHO_1 CD34_2 
## [136] ORTHO_2 CD34_1  ORTHO_1 CD34_2  ORTHO_2 CD34_1  ORTHO_1 CD34_2  ORTHO_2
## [145] CD34_1  ORTHO_1 CD34_2  ORTHO_2 CD34_1  ORTHO_1 CD34_2  ORTHO_2 CD34_1 
## [154] ORTHO_1 CD34_2  ORTHO_2 CD34_1  ORTHO_1 CD34_2  ORTHO_2 CD34_1  ORTHO_1
## [163] CD34_2  ORTHO_2 CD34_1  ORTHO_1 CD34_2  ORTHO_2 CD34_1  ORTHO_1 CD34_2 
## [172] ORTHO_2 CD34_1  ORTHO_1 CD34_2  ORTHO_2 CD34_1  ORTHO_1 CD34_2  ORTHO_2
## [181] CD34_1  ORTHO_1 CD34_2  ORTHO_2 CD34_1  ORTHO_1 CD34_2  ORTHO_2 CD34_1 
## [190] ORTHO_1 CD34_2  ORTHO_2 CD34_1  ORTHO_1 CD34_2  ORTHO_2 CD34_1  ORTHO_1
## [199] CD34_2  ORTHO_2 CD34_1  ORTHO_1 CD34_2  ORTHO_2 CD34_1  ORTHO_1 CD34_2 
## [208] ORTHO_2 CD34_1  ORTHO_1 CD34_2  ORTHO_2 CD34_1  ORTHO_1 CD34_2  ORTHO_2
## [217] CD34_1  ORTHO_1 CD34_2  ORTHO_2 CD34_1  ORTHO_1 CD34_2  ORTHO_2 CD34_1 
## [226] ORTHO_1 CD34_2  ORTHO_2 CD34_1  ORTHO_1 CD34_2  ORTHO_2 CD34_1  ORTHO_1
## [235] CD34_2  ORTHO_2 CD34_1  ORTHO_1 CD34_2  ORTHO_2 CD34_1  ORTHO_1 CD34_2 
## [244] ORTHO_2 CD34_1  ORTHO_1 CD34_2  ORTHO_2 CD34_1  ORTHO_1 CD34_2  ORTHO_2
## [253] CD34_1  ORTHO_1 CD34_2  ORTHO_2 CD34_1  ORTHO_1 CD34_2  ORTHO_2 CD34_1 
## [262] ORTHO_1 CD34_2  ORTHO_2 CD34_1  ORTHO_1 CD34_2  ORTHO_2 CD34_1  ORTHO_1
## [271] CD34_2  ORTHO_2 CD34_1  ORTHO_1 CD34_2  ORTHO_2 CD34_1  ORTHO_1 CD34_2 
## [280] ORTHO_2 CD34_1  ORTHO_1 CD34_2  ORTHO_2 CD34_1  ORTHO_1 CD34_2  ORTHO_2
## [289] CD34_1  ORTHO_1 CD34_2  ORTHO_2 CD34_1  ORTHO_1 CD34_2  ORTHO_2 CD34_1 
## [298] ORTHO_1 CD34_2  ORTHO_2 CD34_1  ORTHO_1 CD34_2  ORTHO_2 CD34_1  ORTHO_1
## [307] CD34_2  ORTHO_2 CD34_1  ORTHO_1 CD34_2  ORTHO_2 CD34_1  ORTHO_1 CD34_2 
## [316] ORTHO_2 CD34_1  ORTHO_1 CD34_2  ORTHO_2 CD34_1  ORTHO_1 CD34_2  ORTHO_2
## [325] CD34_1  ORTHO_1 CD34_2  ORTHO_2 CD34_1  ORTHO_1 CD34_2  ORTHO_2 CD34_1 
## [334] ORTHO_1 CD34_2  ORTHO_2 CD34_1  ORTHO_1 CD34_2  ORTHO_2 CD34_1  ORTHO_1
## [343] CD34_2  ORTHO_2 CD34_1  ORTHO_1 CD34_2  ORTHO_2 CD34_1  ORTHO_1 CD34_2 
## [352] ORTHO_2 CD34_1  ORTHO_1 CD34_2  ORTHO_2 CD34_1  ORTHO_1 CD34_2  ORTHO_2
## [361] CD34_1  ORTHO_1 CD34_2  ORTHO_2 CD34_1  ORTHO_1 CD34_2  ORTHO_2 CD34_1 
## [370] ORTHO_1 CD34_2  ORTHO_2 CD34_1  ORTHO_1 CD34_2  ORTHO_2
## Levels: CD34_1 ORTHO_1 CD34_2 ORTHO_2
##   [1] CD34_1  ORTHO_1 CD34_2  ORTHO_2 CD34_1  ORTHO_1 CD34_2  ORTHO_2 CD34_1 
##  [10] ORTHO_1 CD34_2  ORTHO_2 CD34_1  ORTHO_1 CD34_2  ORTHO_2 CD34_1  ORTHO_1
##  [19] CD34_2  ORTHO_2 CD34_1  ORTHO_1 CD34_2  ORTHO_2 CD34_1  ORTHO_1 CD34_2 
##  [28] ORTHO_2 CD34_1  ORTHO_1 CD34_2  ORTHO_2 CD34_1  ORTHO_1 CD34_2  ORTHO_2
##  [37] CD34_1  ORTHO_1 CD34_2  ORTHO_2 CD34_1  ORTHO_1 CD34_2  ORTHO_2 CD34_1 
##  [46] ORTHO_1 CD34_2  ORTHO_2 CD34_1  ORTHO_1 CD34_2  ORTHO_2 CD34_1  ORTHO_1
##  [55] CD34_2  ORTHO_2 CD34_1  ORTHO_1 CD34_2  ORTHO_2 CD34_1  ORTHO_1 CD34_2 
##  [64] ORTHO_2 CD34_1  ORTHO_1 CD34_2  ORTHO_2 CD34_1  ORTHO_1 CD34_2  ORTHO_2
##  [73] CD34_1  ORTHO_1 CD34_2  ORTHO_2 CD34_1  ORTHO_1 CD34_2  ORTHO_2 CD34_1 
##  [82] ORTHO_1 CD34_2  ORTHO_2 CD34_1  ORTHO_1 CD34_2  ORTHO_2 CD34_1  ORTHO_1
##  [91] CD34_2  ORTHO_2 CD34_1  ORTHO_1 CD34_2  ORTHO_2 CD34_1  ORTHO_1 CD34_2 
## [100] ORTHO_2 CD34_1  ORTHO_1 CD34_2  ORTHO_2 CD34_1  ORTHO_1 CD34_2  ORTHO_2
## [109] CD34_1  ORTHO_1 CD34_2  ORTHO_2 CD34_1  ORTHO_1 CD34_2  ORTHO_2 CD34_1 
## [118] ORTHO_1 CD34_2  ORTHO_2 CD34_1  ORTHO_1 CD34_2  ORTHO_2 CD34_1  ORTHO_1
## [127] CD34_2  ORTHO_2 CD34_1  ORTHO_1 CD34_2  ORTHO_2 CD34_1  ORTHO_1 CD34_2 
## [136] ORTHO_2 CD34_1  ORTHO_1 CD34_2  ORTHO_2 CD34_1  ORTHO_1 CD34_2  ORTHO_2
## [145] CD34_1  ORTHO_1 CD34_2  ORTHO_2 CD34_1  ORTHO_1 CD34_2  ORTHO_2 CD34_1 
## [154] ORTHO_1 CD34_2  ORTHO_2 CD34_1  ORTHO_1 CD34_2  ORTHO_2 CD34_1  ORTHO_1
## [163] CD34_2  ORTHO_2 CD34_1  ORTHO_1 CD34_2  ORTHO_2 CD34_1  ORTHO_1 CD34_2 
## [172] ORTHO_2 CD34_1  ORTHO_1 CD34_2  ORTHO_2 CD34_1  ORTHO_1 CD34_2  ORTHO_2
## [181] CD34_1  ORTHO_1 CD34_2  ORTHO_2 CD34_1  ORTHO_1 CD34_2  ORTHO_2 CD34_1 
## [190] ORTHO_1 CD34_2  ORTHO_2 CD34_1  ORTHO_1 CD34_2  ORTHO_2 CD34_1  ORTHO_1
## [199] CD34_2  ORTHO_2 CD34_1  ORTHO_1 CD34_2  ORTHO_2 CD34_1  ORTHO_1 CD34_2 
## [208] ORTHO_2 CD34_1  ORTHO_1 CD34_2  ORTHO_2 CD34_1  ORTHO_1 CD34_2  ORTHO_2
## [217] CD34_1  ORTHO_1 CD34_2  ORTHO_2 CD34_1  ORTHO_1 CD34_2  ORTHO_2 CD34_1 
## [226] ORTHO_1 CD34_2  ORTHO_2 CD34_1  ORTHO_1 CD34_2  ORTHO_2 CD34_1  ORTHO_1
## [235] CD34_2  ORTHO_2 CD34_1  ORTHO_1 CD34_2  ORTHO_2 CD34_1  ORTHO_1 CD34_2 
## [244] ORTHO_2 CD34_1  ORTHO_1 CD34_2  ORTHO_2 CD34_1  ORTHO_1 CD34_2  ORTHO_2
## [253] CD34_1  ORTHO_1 CD34_2  ORTHO_2 CD34_1  ORTHO_1 CD34_2  ORTHO_2 CD34_1 
## [262] ORTHO_1 CD34_2  ORTHO_2 CD34_1  ORTHO_1 CD34_2  ORTHO_2 CD34_1  ORTHO_1
## [271] CD34_2  ORTHO_2 CD34_1  ORTHO_1 CD34_2  ORTHO_2 CD34_1  ORTHO_1 CD34_2 
## [280] ORTHO_2 CD34_1  ORTHO_1 CD34_2  ORTHO_2 CD34_1  ORTHO_1 CD34_2  ORTHO_2
## [289] CD34_1  ORTHO_1 CD34_2  ORTHO_2 CD34_1  ORTHO_1 CD34_2  ORTHO_2 CD34_1 
## [298] ORTHO_1 CD34_2  ORTHO_2 CD34_1  ORTHO_1 CD34_2  ORTHO_2 CD34_1  ORTHO_1
## [307] CD34_2  ORTHO_2 CD34_1  ORTHO_1 CD34_2  ORTHO_2 CD34_1  ORTHO_1 CD34_2 
## [316] ORTHO_2 CD34_1  ORTHO_1 CD34_2  ORTHO_2 CD34_1  ORTHO_1 CD34_2  ORTHO_2
## [325] CD34_1  ORTHO_1 CD34_2  ORTHO_2 CD34_1  ORTHO_1 CD34_2  ORTHO_2 CD34_1 
## [334] ORTHO_1 CD34_2  ORTHO_2 CD34_1  ORTHO_1 CD34_2  ORTHO_2 CD34_1  ORTHO_1
## [343] CD34_2  ORTHO_2 CD34_1  ORTHO_1 CD34_2  ORTHO_2 CD34_1  ORTHO_1 CD34_2 
## [352] ORTHO_2 CD34_1  ORTHO_1 CD34_2  ORTHO_2 CD34_1  ORTHO_1 CD34_2  ORTHO_2
## [361] CD34_1  ORTHO_1 CD34_2  ORTHO_2 CD34_1  ORTHO_1 CD34_2  ORTHO_2 CD34_1 
## [370] ORTHO_1 CD34_2  ORTHO_2 CD34_1  ORTHO_1 CD34_2  ORTHO_2
## Levels: ORTHO_1 ORTHO_2 CD34_1 CD34_2
##   [1] CD34_1  ORTHO_1 <NA>    <NA>    CD34_1  ORTHO_1 <NA>    <NA>    CD34_1 
##  [10] ORTHO_1 <NA>    <NA>    CD34_1  ORTHO_1 <NA>    <NA>    CD34_1  ORTHO_1
##  [19] <NA>    <NA>    CD34_1  ORTHO_1 <NA>    <NA>    CD34_1  ORTHO_1 <NA>   
##  [28] <NA>    CD34_1  ORTHO_1 <NA>    <NA>    CD34_1  ORTHO_1 <NA>    <NA>   
##  [37] CD34_1  ORTHO_1 <NA>    <NA>    CD34_1  ORTHO_1 <NA>    <NA>    CD34_1 
##  [46] ORTHO_1 <NA>    <NA>    CD34_1  ORTHO_1 <NA>    <NA>    CD34_1  ORTHO_1
##  [55] <NA>    <NA>    CD34_1  ORTHO_1 <NA>    <NA>    CD34_1  ORTHO_1 <NA>   
##  [64] <NA>    CD34_1  ORTHO_1 <NA>    <NA>    CD34_1  ORTHO_1 <NA>    <NA>   
##  [73] CD34_1  ORTHO_1 <NA>    <NA>    CD34_1  ORTHO_1 <NA>    <NA>    CD34_1 
##  [82] ORTHO_1 <NA>    <NA>    CD34_1  ORTHO_1 <NA>    <NA>    CD34_1  ORTHO_1
##  [91] <NA>    <NA>    CD34_1  ORTHO_1 <NA>    <NA>    CD34_1  ORTHO_1 <NA>   
## [100] <NA>    CD34_1  ORTHO_1 <NA>    <NA>    CD34_1  ORTHO_1 <NA>    <NA>   
## [109] CD34_1  ORTHO_1 <NA>    <NA>    CD34_1  ORTHO_1 <NA>    <NA>    CD34_1 
## [118] ORTHO_1 <NA>    <NA>    CD34_1  ORTHO_1 <NA>    <NA>    CD34_1  ORTHO_1
## [127] <NA>    <NA>    CD34_1  ORTHO_1 <NA>    <NA>    CD34_1  ORTHO_1 <NA>   
## [136] <NA>    CD34_1  ORTHO_1 <NA>    <NA>    CD34_1  ORTHO_1 <NA>    <NA>   
## [145] CD34_1  ORTHO_1 <NA>    <NA>    CD34_1  ORTHO_1 <NA>    <NA>    CD34_1 
## [154] ORTHO_1 <NA>    <NA>    CD34_1  ORTHO_1 <NA>    <NA>    CD34_1  ORTHO_1
## [163] <NA>    <NA>    CD34_1  ORTHO_1 <NA>    <NA>    CD34_1  ORTHO_1 <NA>   
## [172] <NA>    CD34_1  ORTHO_1 <NA>    <NA>    CD34_1  ORTHO_1 <NA>    <NA>   
## [181] CD34_1  ORTHO_1 <NA>    <NA>    CD34_1  ORTHO_1 <NA>    <NA>    CD34_1 
## [190] ORTHO_1 <NA>    <NA>    CD34_1  ORTHO_1 <NA>    <NA>    CD34_1  ORTHO_1
## [199] <NA>    <NA>    CD34_1  ORTHO_1 <NA>    <NA>    CD34_1  ORTHO_1 <NA>   
## [208] <NA>    CD34_1  ORTHO_1 <NA>    <NA>    CD34_1  ORTHO_1 <NA>    <NA>   
## [217] CD34_1  ORTHO_1 <NA>    <NA>    CD34_1  ORTHO_1 <NA>    <NA>    CD34_1 
## [226] ORTHO_1 <NA>    <NA>    CD34_1  ORTHO_1 <NA>    <NA>    CD34_1  ORTHO_1
## [235] <NA>    <NA>    CD34_1  ORTHO_1 <NA>    <NA>    CD34_1  ORTHO_1 <NA>   
## [244] <NA>    CD34_1  ORTHO_1 <NA>    <NA>    CD34_1  ORTHO_1 <NA>    <NA>   
## [253] CD34_1  ORTHO_1 <NA>    <NA>    CD34_1  ORTHO_1 <NA>    <NA>    CD34_1 
## [262] ORTHO_1 <NA>    <NA>    CD34_1  ORTHO_1 <NA>    <NA>    CD34_1  ORTHO_1
## [271] <NA>    <NA>    CD34_1  ORTHO_1 <NA>    <NA>    CD34_1  ORTHO_1 <NA>   
## [280] <NA>    CD34_1  ORTHO_1 <NA>    <NA>    CD34_1  ORTHO_1 <NA>    <NA>   
## [289] CD34_1  ORTHO_1 <NA>    <NA>    CD34_1  ORTHO_1 <NA>    <NA>    CD34_1 
## [298] ORTHO_1 <NA>    <NA>    CD34_1  ORTHO_1 <NA>    <NA>    CD34_1  ORTHO_1
## [307] <NA>    <NA>    CD34_1  ORTHO_1 <NA>    <NA>    CD34_1  ORTHO_1 <NA>   
## [316] <NA>    CD34_1  ORTHO_1 <NA>    <NA>    CD34_1  ORTHO_1 <NA>    <NA>   
## [325] CD34_1  ORTHO_1 <NA>    <NA>    CD34_1  ORTHO_1 <NA>    <NA>    CD34_1 
## [334] ORTHO_1 <NA>    <NA>    CD34_1  ORTHO_1 <NA>    <NA>    CD34_1  ORTHO_1
## [343] <NA>    <NA>    CD34_1  ORTHO_1 <NA>    <NA>    CD34_1  ORTHO_1 <NA>   
## [352] <NA>    CD34_1  ORTHO_1 <NA>    <NA>    CD34_1  ORTHO_1 <NA>    <NA>   
## [361] CD34_1  ORTHO_1 <NA>    <NA>    CD34_1  ORTHO_1 <NA>    <NA>    CD34_1 
## [370] ORTHO_1 <NA>    <NA>    CD34_1  ORTHO_1 <NA>    <NA>   
## Levels: ORTHO_1 CD34_1
## [1] "CD34_1"  "CD34_2"  "ORTHO_1" "ORTHO_2"
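
The different Levels lines in the output above can be reproduced with base R's factor() and its levels argument. This is a minimal sketch on an eight-element toy vector; the object names are assumptions, and the exact calls used in the workshop may differ.

```r
# Toy vector of sample names in the order they appear in the data.
sample_names <- rep(c("CD34_1", "ORTHO_1", "CD34_2", "ORTHO_2"), times = 2)

# Default: levels are the sorted unique values.
f_default <- factor(sample_names)

# Levels in order of first appearance.
f_appearance <- factor(sample_names, levels = unique(sample_names))

# Levels in an explicit, hand-chosen order.
f_custom <- factor(sample_names,
                   levels = c("ORTHO_1", "ORTHO_2", "CD34_1", "CD34_2"))

# Values missing from levels become NA.
f_subset <- factor(sample_names, levels = c("ORTHO_1", "CD34_1"))

levels(f_default)
levels(f_appearance)
levels(f_custom)
f_subset
```

The last case is a common source of silent data loss: any value that is misspelled or left out of levels is converted to NA without a warning.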

14.3 Changing the order

##   [1] CD34_1  ORTHO_1 CD34_2  ORTHO_2 CD34_1  ORTHO_1 CD34_2  ORTHO_2 CD34_1 
##  [10] ORTHO_1 CD34_2  ORTHO_2 CD34_1  ORTHO_1 CD34_2  ORTHO_2 CD34_1  ORTHO_1
##  [19] CD34_2  ORTHO_2 CD34_1  ORTHO_1 CD34_2  ORTHO_2 CD34_1  ORTHO_1 CD34_2 
##  [28] ORTHO_2 CD34_1  ORTHO_1 CD34_2  ORTHO_2 CD34_1  ORTHO_1 CD34_2  ORTHO_2
##  [37] CD34_1  ORTHO_1 CD34_2  ORTHO_2 CD34_1  ORTHO_1 CD34_2  ORTHO_2 CD34_1 
##  [46] ORTHO_1 CD34_2  ORTHO_2 CD34_1  ORTHO_1 CD34_2  ORTHO_2 CD34_1  ORTHO_1
##  [55] CD34_2  ORTHO_2 CD34_1  ORTHO_1 CD34_2  ORTHO_2 CD34_1  ORTHO_1 CD34_2 
##  [64] ORTHO_2 CD34_1  ORTHO_1 CD34_2  ORTHO_2 CD34_1  ORTHO_1 CD34_2  ORTHO_2
##  [73] CD34_1  ORTHO_1 CD34_2  ORTHO_2 CD34_1  ORTHO_1 CD34_2  ORTHO_2 CD34_1 
##  [82] ORTHO_1 CD34_2  ORTHO_2 CD34_1  ORTHO_1 CD34_2  ORTHO_2 CD34_1  ORTHO_1
##  [91] CD34_2  ORTHO_2 CD34_1  ORTHO_1 CD34_2  ORTHO_2 CD34_1  ORTHO_1 CD34_2 
## [100] ORTHO_2 CD34_1  ORTHO_1 CD34_2  ORTHO_2 CD34_1  ORTHO_1 CD34_2  ORTHO_2
## [109] CD34_1  ORTHO_1 CD34_2  ORTHO_2 CD34_1  ORTHO_1 CD34_2  ORTHO_2 CD34_1 
## [118] ORTHO_1 CD34_2  ORTHO_2 CD34_1  ORTHO_1 CD34_2  ORTHO_2 CD34_1  ORTHO_1
## [127] CD34_2  ORTHO_2 CD34_1  ORTHO_1 CD34_2  ORTHO_2 CD34_1  ORTHO_1 CD34_2 
## [136] ORTHO_2 CD34_1  ORTHO_1 CD34_2  ORTHO_2 CD34_1  ORTHO_1 CD34_2  ORTHO_2
## [145] CD34_1  ORTHO_1 CD34_2  ORTHO_2 CD34_1  ORTHO_1 CD34_2  ORTHO_2 CD34_1 
## [154] ORTHO_1 CD34_2  ORTHO_2 CD34_1  ORTHO_1 CD34_2  ORTHO_2 CD34_1  ORTHO_1
## [163] CD34_2  ORTHO_2 CD34_1  ORTHO_1 CD34_2  ORTHO_2 CD34_1  ORTHO_1 CD34_2 
## [172] ORTHO_2 CD34_1  ORTHO_1 CD34_2  ORTHO_2 CD34_1  ORTHO_1 CD34_2  ORTHO_2
## [181] CD34_1  ORTHO_1 CD34_2  ORTHO_2 CD34_1  ORTHO_1 CD34_2  ORTHO_2 CD34_1 
## [190] ORTHO_1 CD34_2  ORTHO_2 CD34_1  ORTHO_1 CD34_2  ORTHO_2 CD34_1  ORTHO_1
## [199] CD34_2  ORTHO_2 CD34_1  ORTHO_1 CD34_2  ORTHO_2 CD34_1  ORTHO_1 CD34_2 
## [208] ORTHO_2 CD34_1  ORTHO_1 CD34_2  ORTHO_2 CD34_1  ORTHO_1 CD34_2  ORTHO_2
## [217] CD34_1  ORTHO_1 CD34_2  ORTHO_2 CD34_1  ORTHO_1 CD34_2  ORTHO_2 CD34_1 
## [226] ORTHO_1 CD34_2  ORTHO_2 CD34_1  ORTHO_1 CD34_2  ORTHO_2 CD34_1  ORTHO_1
## [235] CD34_2  ORTHO_2 CD34_1  ORTHO_1 CD34_2  ORTHO_2 CD34_1  ORTHO_1 CD34_2 
## [244] ORTHO_2 CD34_1  ORTHO_1 CD34_2  ORTHO_2 CD34_1  ORTHO_1 CD34_2  ORTHO_2
## [253] CD34_1  ORTHO_1 CD34_2  ORTHO_2 CD34_1  ORTHO_1 CD34_2  ORTHO_2 CD34_1 
## [262] ORTHO_1 CD34_2  ORTHO_2 CD34_1  ORTHO_1 CD34_2  ORTHO_2 CD34_1  ORTHO_1
## [271] CD34_2  ORTHO_2 CD34_1  ORTHO_1 CD34_2  ORTHO_2 CD34_1  ORTHO_1 CD34_2 
## [280] ORTHO_2 CD34_1  ORTHO_1 CD34_2  ORTHO_2 CD34_1  ORTHO_1 CD34_2  ORTHO_2
## [289] CD34_1  ORTHO_1 CD34_2  ORTHO_2 CD34_1  ORTHO_1 CD34_2  ORTHO_2 CD34_1 
## [298] ORTHO_1 CD34_2  ORTHO_2 CD34_1  ORTHO_1 CD34_2  ORTHO_2 CD34_1  ORTHO_1
## [307] CD34_2  ORTHO_2 CD34_1  ORTHO_1 CD34_2  ORTHO_2 CD34_1  ORTHO_1 CD34_2 
## [316] ORTHO_2 CD34_1  ORTHO_1 CD34_2  ORTHO_2 CD34_1  ORTHO_1 CD34_2  ORTHO_2
## [325] CD34_1  ORTHO_1 CD34_2  ORTHO_2 CD34_1  ORTHO_1 CD34_2  ORTHO_2 CD34_1 
## [334] ORTHO_1 CD34_2  ORTHO_2 CD34_1  ORTHO_1 CD34_2  ORTHO_2 CD34_1  ORTHO_1
## [343] CD34_2  ORTHO_2 CD34_1  ORTHO_1 CD34_2  ORTHO_2 CD34_1  ORTHO_1 CD34_2 
## [352] ORTHO_2 CD34_1  ORTHO_1 CD34_2  ORTHO_2 CD34_1  ORTHO_1 CD34_2  ORTHO_2
## [361] CD34_1  ORTHO_1 CD34_2  ORTHO_2 CD34_1  ORTHO_1 CD34_2  ORTHO_2 CD34_1 
## [370] ORTHO_1 CD34_2  ORTHO_2 CD34_1  ORTHO_1 CD34_2  ORTHO_2
## Levels: ORTHO_1 ORTHO_2 CD34_1 CD34_2
##   [1] CD34_1  ORTHO_1 CD34_2  ORTHO_2 CD34_1  ORTHO_1 CD34_2  ORTHO_2 CD34_1 
##  [10] ORTHO_1 CD34_2  ORTHO_2 CD34_1  ORTHO_1 CD34_2  ORTHO_2 CD34_1  ORTHO_1
##  [19] CD34_2  ORTHO_2 CD34_1  ORTHO_1 CD34_2  ORTHO_2 CD34_1  ORTHO_1 CD34_2 
##  [28] ORTHO_2 CD34_1  ORTHO_1 CD34_2  ORTHO_2 CD34_1  ORTHO_1 CD34_2  ORTHO_2
##  [37] CD34_1  ORTHO_1 CD34_2  ORTHO_2 CD34_1  ORTHO_1 CD34_2  ORTHO_2 CD34_1 
##  [46] ORTHO_1 CD34_2  ORTHO_2 CD34_1  ORTHO_1 CD34_2  ORTHO_2 CD34_1  ORTHO_1
##  [55] CD34_2  ORTHO_2 CD34_1  ORTHO_1 CD34_2  ORTHO_2 CD34_1  ORTHO_1 CD34_2 
##  [64] ORTHO_2 CD34_1  ORTHO_1 CD34_2  ORTHO_2 CD34_1  ORTHO_1 CD34_2  ORTHO_2
##  [73] CD34_1  ORTHO_1 CD34_2  ORTHO_2 CD34_1  ORTHO_1 CD34_2  ORTHO_2 CD34_1 
##  [82] ORTHO_1 CD34_2  ORTHO_2 CD34_1  ORTHO_1 CD34_2  ORTHO_2 CD34_1  ORTHO_1
##  [91] CD34_2  ORTHO_2 CD34_1  ORTHO_1 CD34_2  ORTHO_2 CD34_1  ORTHO_1 CD34_2 
## [100] ORTHO_2 CD34_1  ORTHO_1 CD34_2  ORTHO_2 CD34_1  ORTHO_1 CD34_2  ORTHO_2
## [109] CD34_1  ORTHO_1 CD34_2  ORTHO_2 CD34_1  ORTHO_1 CD34_2  ORTHO_2 CD34_1 
## [118] ORTHO_1 CD34_2  ORTHO_2 CD34_1  ORTHO_1 CD34_2  ORTHO_2 CD34_1  ORTHO_1
## [127] CD34_2  ORTHO_2 CD34_1  ORTHO_1 CD34_2  ORTHO_2 CD34_1  ORTHO_1 CD34_2 
## [136] ORTHO_2 CD34_1  ORTHO_1 CD34_2  ORTHO_2 CD34_1  ORTHO_1 CD34_2  ORTHO_2
## [145] CD34_1  ORTHO_1 CD34_2  ORTHO_2 CD34_1  ORTHO_1 CD34_2  ORTHO_2 CD34_1 
## [154] ORTHO_1 CD34_2  ORTHO_2 CD34_1  ORTHO_1 CD34_2  ORTHO_2 CD34_1  ORTHO_1
## [163] CD34_2  ORTHO_2 CD34_1  ORTHO_1 CD34_2  ORTHO_2 CD34_1  ORTHO_1 CD34_2 
## [172] ORTHO_2 CD34_1  ORTHO_1 CD34_2  ORTHO_2 CD34_1  ORTHO_1 CD34_2  ORTHO_2
## [181] CD34_1  ORTHO_1 CD34_2  ORTHO_2 CD34_1  ORTHO_1 CD34_2  ORTHO_2 CD34_1 
## [190] ORTHO_1 CD34_2  ORTHO_2 CD34_1  ORTHO_1 CD34_2  ORTHO_2 CD34_1  ORTHO_1
## [199] CD34_2  ORTHO_2 CD34_1  ORTHO_1 CD34_2  ORTHO_2 CD34_1  ORTHO_1 CD34_2 
## [208] ORTHO_2 CD34_1  ORTHO_1 CD34_2  ORTHO_2 CD34_1  ORTHO_1 CD34_2  ORTHO_2
## [217] CD34_1  ORTHO_1 CD34_2  ORTHO_2 CD34_1  ORTHO_1 CD34_2  ORTHO_2 CD34_1 
## [226] ORTHO_1 CD34_2  ORTHO_2 CD34_1  ORTHO_1 CD34_2  ORTHO_2 CD34_1  ORTHO_1
## [235] CD34_2  ORTHO_2 CD34_1  ORTHO_1 CD34_2  ORTHO_2 CD34_1  ORTHO_1 CD34_2 
## [244] ORTHO_2 CD34_1  ORTHO_1 CD34_2  ORTHO_2 CD34_1  ORTHO_1 CD34_2  ORTHO_2
## [253] CD34_1  ORTHO_1 CD34_2  ORTHO_2 CD34_1  ORTHO_1 CD34_2  ORTHO_2 CD34_1 
## [262] ORTHO_1 CD34_2  ORTHO_2 CD34_1  ORTHO_1 CD34_2  ORTHO_2 CD34_1  ORTHO_1
## [271] CD34_2  ORTHO_2 CD34_1  ORTHO_1 CD34_2  ORTHO_2 CD34_1  ORTHO_1 CD34_2 
## [280] ORTHO_2 CD34_1  ORTHO_1 CD34_2  ORTHO_2 CD34_1  ORTHO_1 CD34_2  ORTHO_2
## [289] CD34_1  ORTHO_1 CD34_2  ORTHO_2 CD34_1  ORTHO_1 CD34_2  ORTHO_2 CD34_1 
## [298] ORTHO_1 CD34_2  ORTHO_2 CD34_1  ORTHO_1 CD34_2  ORTHO_2 CD34_1  ORTHO_1
## [307] CD34_2  ORTHO_2 CD34_1  ORTHO_1 CD34_2  ORTHO_2 CD34_1  ORTHO_1 CD34_2 
## [316] ORTHO_2 CD34_1  ORTHO_1 CD34_2  ORTHO_2 CD34_1  ORTHO_1 CD34_2  ORTHO_2
## [325] CD34_1  ORTHO_1 CD34_2  ORTHO_2 CD34_1  ORTHO_1 CD34_2  ORTHO_2 CD34_1 
## [334] ORTHO_1 CD34_2  ORTHO_2 CD34_1  ORTHO_1 CD34_2  ORTHO_2 CD34_1  ORTHO_1
## [343] CD34_2  ORTHO_2 CD34_1  ORTHO_1 CD34_2  ORTHO_2 CD34_1  ORTHO_1 CD34_2 
## [352] ORTHO_2 CD34_1  ORTHO_1 CD34_2  ORTHO_2 CD34_1  ORTHO_1 CD34_2  ORTHO_2
## [361] CD34_1  ORTHO_1 CD34_2  ORTHO_2 CD34_1  ORTHO_1 CD34_2  ORTHO_2 CD34_1 
## [370] ORTHO_1 CD34_2  ORTHO_2 CD34_1  ORTHO_1 CD34_2  ORTHO_2
## Levels: ORTHO_1 CD34_1 CD34_2 ORTHO_2
##   [1] CD34_1  ORTHO_1 CD34_2  ORTHO_2 CD34_1  ORTHO_1 CD34_2  ORTHO_2 CD34_1 
##  [10] ORTHO_1 CD34_2  ORTHO_2 CD34_1  ORTHO_1 CD34_2  ORTHO_2 CD34_1  ORTHO_1
##  [19] CD34_2  ORTHO_2 CD34_1  ORTHO_1 CD34_2  ORTHO_2 CD34_1  ORTHO_1 CD34_2 
##  [28] ORTHO_2 CD34_1  ORTHO_1 CD34_2  ORTHO_2 CD34_1  ORTHO_1 CD34_2  ORTHO_2
##  [37] CD34_1  ORTHO_1 CD34_2  ORTHO_2 CD34_1  ORTHO_1 CD34_2  ORTHO_2 CD34_1 
##  [46] ORTHO_1 CD34_2  ORTHO_2 CD34_1  ORTHO_1 CD34_2  ORTHO_2 CD34_1  ORTHO_1
##  [55] CD34_2  ORTHO_2 CD34_1  ORTHO_1 CD34_2  ORTHO_2 CD34_1  ORTHO_1 CD34_2 
##  [64] ORTHO_2 CD34_1  ORTHO_1 CD34_2  ORTHO_2 CD34_1  ORTHO_1 CD34_2  ORTHO_2
##  [73] CD34_1  ORTHO_1 CD34_2  ORTHO_2 CD34_1  ORTHO_1 CD34_2  ORTHO_2 CD34_1 
##  [82] ORTHO_1 CD34_2  ORTHO_2 CD34_1  ORTHO_1 CD34_2  ORTHO_2 CD34_1  ORTHO_1
##  [91] CD34_2  ORTHO_2 CD34_1  ORTHO_1 CD34_2  ORTHO_2 CD34_1  ORTHO_1 CD34_2 
## [100] ORTHO_2 CD34_1  ORTHO_1 CD34_2  ORTHO_2 CD34_1  ORTHO_1 CD34_2  ORTHO_2
## [109] CD34_1  ORTHO_1 CD34_2  ORTHO_2 CD34_1  ORTHO_1 CD34_2  ORTHO_2 CD34_1 
## [118] ORTHO_1 CD34_2  ORTHO_2 CD34_1  ORTHO_1 CD34_2  ORTHO_2 CD34_1  ORTHO_1
## [127] CD34_2  ORTHO_2 CD34_1  ORTHO_1 CD34_2  ORTHO_2 CD34_1  ORTHO_1 CD34_2 
## [136] ORTHO_2 CD34_1  ORTHO_1 CD34_2  ORTHO_2 CD34_1  ORTHO_1 CD34_2  ORTHO_2
## [145] CD34_1  ORTHO_1 CD34_2  ORTHO_2 CD34_1  ORTHO_1 CD34_2  ORTHO_2 CD34_1 
## [154] ORTHO_1 CD34_2  ORTHO_2 CD34_1  ORTHO_1 CD34_2  ORTHO_2 CD34_1  ORTHO_1
## [163] CD34_2  ORTHO_2 CD34_1  ORTHO_1 CD34_2  ORTHO_2 CD34_1  ORTHO_1 CD34_2 
## [172] ORTHO_2 CD34_1  ORTHO_1 CD34_2  ORTHO_2 CD34_1  ORTHO_1 CD34_2  ORTHO_2
## [181] CD34_1  ORTHO_1 CD34_2  ORTHO_2 CD34_1  ORTHO_1 CD34_2  ORTHO_2 CD34_1 
## [190] ORTHO_1 CD34_2  ORTHO_2 CD34_1  ORTHO_1 CD34_2  ORTHO_2 CD34_1  ORTHO_1
## [199] CD34_2  ORTHO_2 CD34_1  ORTHO_1 CD34_2  ORTHO_2 CD34_1  ORTHO_1 CD34_2 
## [208] ORTHO_2 CD34_1  ORTHO_1 CD34_2  ORTHO_2 CD34_1  ORTHO_1 CD34_2  ORTHO_2
## [217] CD34_1  ORTHO_1 CD34_2  ORTHO_2 CD34_1  ORTHO_1 CD34_2  ORTHO_2 CD34_1 
## [226] ORTHO_1 CD34_2  ORTHO_2 CD34_1  ORTHO_1 CD34_2  ORTHO_2 CD34_1  ORTHO_1
## [235] CD34_2  ORTHO_2 CD34_1  ORTHO_1 CD34_2  ORTHO_2 CD34_1  ORTHO_1 CD34_2 
## [244] ORTHO_2 CD34_1  ORTHO_1 CD34_2  ORTHO_2 CD34_1  ORTHO_1 CD34_2  ORTHO_2
## [253] CD34_1  ORTHO_1 CD34_2  ORTHO_2 CD34_1  ORTHO_1 CD34_2  ORTHO_2 CD34_1 
## [262] ORTHO_1 CD34_2  ORTHO_2 CD34_1  ORTHO_1 CD34_2  ORTHO_2 CD34_1  ORTHO_1
## [271] CD34_2  ORTHO_2 CD34_1  ORTHO_1 CD34_2  ORTHO_2 CD34_1  ORTHO_1 CD34_2 
## [280] ORTHO_2 CD34_1  ORTHO_1 CD34_2  ORTHO_2 CD34_1  ORTHO_1 CD34_2  ORTHO_2
## [289] CD34_1  ORTHO_1 CD34_2  ORTHO_2 CD34_1  ORTHO_1 CD34_2  ORTHO_2 CD34_1 
## [298] ORTHO_1 CD34_2  ORTHO_2 CD34_1  ORTHO_1 CD34_2  ORTHO_2 CD34_1  ORTHO_1
## [307] CD34_2  ORTHO_2 CD34_1  ORTHO_1 CD34_2  ORTHO_2 CD34_1  ORTHO_1 CD34_2 
## [316] ORTHO_2 CD34_1  ORTHO_1 CD34_2  ORTHO_2 CD34_1  ORTHO_1 CD34_2  ORTHO_2
## [325] CD34_1  ORTHO_1 CD34_2  ORTHO_2 CD34_1  ORTHO_1 CD34_2  ORTHO_2 CD34_1 
## [334] ORTHO_1 CD34_2  ORTHO_2 CD34_1  ORTHO_1 CD34_2  ORTHO_2 CD34_1  ORTHO_1
## [343] CD34_2  ORTHO_2 CD34_1  ORTHO_1 CD34_2  ORTHO_2 CD34_1  ORTHO_1 CD34_2 
## [352] ORTHO_2 CD34_1  ORTHO_1 CD34_2  ORTHO_2 CD34_1  ORTHO_1 CD34_2  ORTHO_2
## [361] CD34_1  ORTHO_1 CD34_2  ORTHO_2 CD34_1  ORTHO_1 CD34_2  ORTHO_2 CD34_1 
## [370] ORTHO_1 CD34_2  ORTHO_2 CD34_1  ORTHO_1 CD34_2  ORTHO_2
## Levels: ORTHO_2 ORTHO_1 CD34_2 CD34_1
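
The three level orders printed above can be reproduced with forcats. This is a sketch on a small made-up factor (`samples` is hypothetical), and it may not be the exact set of calls used in the workshop:

```r
library(forcats)

# Hypothetical sample labels; the default level order is alphabetical
samples <- factor(rep(c("CD34_1", "ORTHO_1", "CD34_2", "ORTHO_2"), times = 3))

# Move several levels to the front, in the order given
two_front <- fct_relevel(samples, "ORTHO_1", "ORTHO_2")
levels(two_front)

# Move a single level to the front; the others keep their relative order
one_front <- fct_relevel(samples, "ORTHO_1")
levels(one_front)

# Reverse the current level order
reversed <- fct_rev(samples)
levels(reversed)
```

Note that only the levels change. The values themselves stay in the same positions, which is why the long printouts above look identical apart from the `Levels:` line.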

14.5 Other useful things.

##  [1] "yearling"           "juvenile"           "juvenile"          
##  [4] "juvenile"           "juvenile"           "mixed age juvenile"
##  [7] "juvenile"           "juvenile"           "juvenile"          
## [10] "juvenile"           "juvenile"           "juvenile"          
## [13] "juvenile"           "juvenile"           "juvenile"          
## [16] "juvenile"           "juvenile"           "juvenile"          
## [19] "juvenile"           "juvenile"           "juvenile"          
## [22] "juvenile"           "juvenile"           "juvenile"          
## [25] "juvenile"           "juvenile"           "juvenile"          
## [28] "juvenile"           "juvenile"           "juvenile"          
## [31] "yearling"           "yearling"           "juvenile"          
## [34] "yearling"           "mixed age juvenile" "juvenile"          
## [37] "juvenile"           "juvenile"           "juvenile"          
## [40] "juvenile"           "juvenile"           "juvenile"          
## [43] "yearling"           "yearling"           "yearling"          
## [46] "yearling"           "yearling"           "yearling"          
## [49] "yearling"           "yearling"           "yearling"          
## [52] "yearling"           "yearling"           "yearling"          
## [55] "yearling"           "yearling"           "yearling"          
## [58] "yearling"           "subyearling"        "juvenile"          
## [61] "juvenile"           "juvenile"           "juvenile"          
## [64] "juvenile"           "juvenile"           "yearling"          
## [67] "juvenile"           "juvenile"           "mixed age juvenile"
## [70] "yearling"           "yearling"           "yearling"          
## [73] "yearling"           "yearling"           "yearling"          
## [76] "yearling"           "yearling"           "juvenile"          
## [79] "juvenile"           "juvenile"           "juvenile"          
## [82] "juvenile"           "yearling"           "yearling"          
## [85] "yearling"           "yearling"           "yearling"          
## [88] "yearling"           "yearling"           "yearling"          
## [91] "yearling"           "yearling"           "yearling"          
## [94] "yearling"           "yearling"           "yearling"          
## [97] "yearling"
##  [1] yearling       juvenile       juvenile       juvenile       juvenile      
##  [6] mixed juvenile juvenile       juvenile       juvenile       juvenile      
## [11] juvenile       juvenile       juvenile       juvenile       juvenile      
## [16] juvenile       juvenile       juvenile       juvenile       juvenile      
## [21] juvenile       juvenile       juvenile       juvenile       juvenile      
## [26] juvenile       juvenile       juvenile       juvenile       juvenile      
## [31] yearling       yearling       juvenile       yearling       mixed juvenile
## [36] juvenile       juvenile       juvenile       juvenile       juvenile      
## [41] juvenile       juvenile       yearling       yearling       yearling      
## [46] yearling       yearling       yearling       yearling       yearling      
## [51] yearling       yearling       yearling       yearling       yearling      
## [56] yearling       yearling       yearling       subyearling    juvenile      
## [61] juvenile       juvenile       juvenile       juvenile       juvenile      
## [66] yearling       juvenile       juvenile       mixed juvenile yearling      
## [71] yearling       yearling       yearling       yearling       yearling      
## [76] yearling       yearling       juvenile       juvenile       juvenile      
## [81] juvenile       juvenile       yearling       yearling       yearling      
## [86] yearling       yearling       yearling       yearling       yearling      
## [91] yearling       yearling       yearling       yearling       yearling      
## [96] yearling       yearling      
## Levels: yearling juvenile mixed juvenile subyearling
##  [1] yearling    juvenile    juvenile    juvenile    juvenile    juvenile   
##  [7] juvenile    juvenile    juvenile    juvenile    juvenile    juvenile   
## [13] juvenile    juvenile    juvenile    juvenile    juvenile    juvenile   
## [19] juvenile    juvenile    juvenile    juvenile    juvenile    juvenile   
## [25] juvenile    juvenile    juvenile    juvenile    juvenile    juvenile   
## [31] yearling    yearling    juvenile    yearling    juvenile    juvenile   
## [37] juvenile    juvenile    juvenile    juvenile    juvenile    juvenile   
## [43] yearling    yearling    yearling    yearling    yearling    yearling   
## [49] yearling    yearling    yearling    yearling    yearling    yearling   
## [55] yearling    yearling    yearling    yearling    subyearling juvenile   
## [61] juvenile    juvenile    juvenile    juvenile    juvenile    yearling   
## [67] juvenile    juvenile    juvenile    yearling    yearling    yearling   
## [73] yearling    yearling    yearling    yearling    yearling    juvenile   
## [79] juvenile    juvenile    juvenile    juvenile    yearling    yearling   
## [85] yearling    yearling    yearling    yearling    yearling    yearling   
## [91] yearling    yearling    yearling    yearling    yearling    yearling   
## [97] yearling   
## Levels: yearling juvenile subyearling
## # A tibble: 4 x 2
##   f                      n
##   <fct>              <int>
## 1 yearling              44
## 2 juvenile              49
## 3 mixed age juvenile     3
## 4 subyearling            1
##  [1] yearling juvenile juvenile juvenile juvenile Other    juvenile juvenile
##  [9] juvenile juvenile juvenile juvenile juvenile juvenile juvenile juvenile
## [17] juvenile juvenile juvenile juvenile juvenile juvenile juvenile juvenile
## [25] juvenile juvenile juvenile juvenile juvenile juvenile yearling yearling
## [33] juvenile yearling Other    juvenile juvenile juvenile juvenile juvenile
## [41] juvenile juvenile yearling yearling yearling yearling yearling yearling
## [49] yearling yearling yearling yearling yearling yearling yearling yearling
## [57] yearling yearling Other    juvenile juvenile juvenile juvenile juvenile
## [65] juvenile yearling juvenile juvenile Other    yearling yearling yearling
## [73] yearling yearling yearling yearling yearling juvenile juvenile juvenile
## [81] juvenile juvenile yearling yearling yearling yearling yearling yearling
## [89] yearling yearling yearling yearling yearling yearling yearling yearling
## [97] yearling
## Levels: yearling juvenile Other
## # A tibble: 3 x 2
##   f            n
##   <fct>    <int>
## 1 yearling    44
## 2 juvenile    49
## 3 Other        4
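
The outputs in this section show levels being renamed, merged, counted and lumped. A rough sketch of forcats calls that behave this way, using a short made-up vector (`age` is hypothetical, and the workshop may have used slightly different calls):

```r
library(forcats)

# Hypothetical age classes, with levels kept in order of first appearance
age_chr <- c("yearling", "juvenile", "juvenile", "juvenile",
             "mixed age juvenile", "subyearling", "yearling")
age <- factor(age_chr, levels = unique(age_chr))

# Rename one level: new name on the left, old name on the right
renamed <- fct_recode(age, "mixed juvenile" = "mixed age juvenile")
levels(renamed)

# Merge two levels into one
merged <- fct_collapse(age, juvenile = c("juvenile", "mixed age juvenile"))
levels(merged)

# Count the observations in each level (returns a tibble with columns f and n)
counts <- fct_count(age)
counts

# Keep the two most frequent levels and pool everything else into "Other"
lumped <- fct_lump_n(age, n = 2)
levels(lumped)
```

Lumping rare levels is handy before plotting, since it stops a handful of tiny groups from cluttering a legend.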

## CHALLENGE (short versus long genes?)

14.6 Tidy beyond this workshop

Hadley Wickham (Chief Scientist at RStudio) is the driving force behind the tidyverse.

Hadley wrote a paper about why he thinks tidy data is best: www.jstatsoft.org/v59/i10/paper.

There is a lot of support for all things tidy at: https://www.tidyverse.org/

14.7 Tidy packages to check out:

readxl: This package is very useful when you want to import Excel sheets into R

googledrive: Interact with your Google Drive through R

lubridate and hms: Allow managing of calendar dates and time formats

magrittr: provides the pipe operator (%>%) for chaining function calls

broom: helps tidy up the output of standard base functions, e.g. lm or t.test

tidymodels: a collection of packages for modelling and machine learning that follows tidyverse principles
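
To give a feel for the broom entry in the list above, here is a small sketch using the built-in `mtcars` dataset (chosen only for illustration, it is not part of the workshop data):

```r
library(broom)

# Fit a linear model on a dataset that ships with R
fit <- lm(mpg ~ wt, data = mtcars)

# tidy() returns one row per model term as a tibble,
# with columns such as estimate, std.error, statistic and p.value
coefs <- tidy(fit)
coefs

# The same function also works on the result of a t-test
tt <- tidy(t.test(mpg ~ am, data = mtcars))
tt
```

Because the results are ordinary tibbles, they can go straight into dplyr or ggplot2 like any other data frame.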

14.8 Other Good Resources

GGplot here

tidy workbook

15 BRC outro